J Gen Virol 87 (2006), 1805-1821; DOI 10.1099/vir.0.81786-0

© 2006 Society for General Microbiology

Review article

Unravelling the complexities of respiratory syncytial virus RNA synthesis

Vanessa M. Cowton, David R. McGivern and Rachel Fearns

Division of Pathology and Neuroscience, University of Dundee Medical School, Dundee DD1 9SY, UK

Correspondence
Rachel Fearns
r.fearns{at}dundee.ac.uk


   ABSTRACT
Human respiratory syncytial virus (RSV) is the leading cause of paediatric respiratory disease and is the focus of antiviral- and vaccine-development programmes. These goals have been aided by an understanding of the virus genome architecture and the mechanisms by which it is expressed and replicated. RSV is a member of the order Mononegavirales and, as such, has a genome consisting of a single strand of negative-sense RNA. At first glance, transcription and genome replication appear straightforward, requiring self-contained promoter regions at the 3' ends of the genome and antigenome RNAs, short cis-acting elements flanking each of the genes and one polymerase. However, from these minimal elements, the virus is able to generate an array of capped, methylated and polyadenylated mRNAs and encapsidated antigenome and genome RNAs, all in the appropriate ratios to facilitate virus replication. The apparent simplicity of genome expression and replication is a consequence of considerable complexity in the polymerase structure and its cognate cis-acting sequences; here, our understanding of mechanisms by which the RSV polymerase proteins interact with signals in the RNA template to produce different RNA products is reviewed.

Published online ahead of print on 1 March 2006 as DOI 10.1099/vir.0.81786-0.

Background and scope of the review
The World Health Organization estimates that Human respiratory syncytial virus (RSV) is responsible for 64 million infections and 160 000 deaths per annum. Its victims are mostly young infants, but it is increasingly recognized as a significant cause of disease in the elderly population and can often be fatal for patients with compromised immune systems (Collins et al., 2001). RSV is a member of the subfamily Pneumovirinae in the family Paramyxoviridae, order Mononegavirales, i.e. the non-segmented, negative-strand RNA viruses. This order includes several exotic pathogens, such as the Ebola and Nipah viruses, and other, more familiar ones, such as the parainfluenza, measles and mumps viruses. Most of our initial understanding of mononegavirus transcription and genome replication stemmed from studies with the paramyxovirus Sendai virus (SeV) and the rhabdovirus Vesicular stomatitis virus (VSV). These viruses can be grown to very high titres, which has facilitated analysis of their RNA-synthesis mechanisms using biochemical techniques. These studies still provide a blueprint for mononegavirus RNA synthesis; however, sequence analysis of different virus genomes and advances in reverse-genetics techniques have opened up this field. Recent investigations have revealed that mononegaviruses differ in the layout of their promoter and gene-junction regions, template structure and polymerase composition. Therefore, although the overall strategy of transcription and replication is similar for all non-segmented, negative-strand RNA viruses, it is possible that the molecular mechanisms that individual viruses use to achieve this differ.
Given that RSV is an important human pathogen, it is worth consideration in its own right. In this review, we examine the data regarding RSV specifically, highlighting, where appropriate, similarities to and differences from other non-segmented, negative-strand RNA viruses, which might help to identify common themes across the order. In addition, we draw comparisons with well-characterized cellular-transcription models, which provide paradigms for understanding RNA synthesis.

The RSV replication cycle
In the RSV particle, the genome is contained within a helical nucleocapsid, which is associated with virus polymerase proteins and surrounded by matrix protein and an envelope containing viral glycoproteins. The nucleocapsid and polymerase are delivered to the host-cell cytoplasm by direct fusion of the virus particle with the plasma membrane and it is in the cytoplasm that the genome is transcribed and replicated. Transcription results in synthesis of ten monocistronic, capped, methylated and polyadenylated mRNAs, which are translated by the host-cell machinery. Replication involves generation of a complete, positive-sense RNA complement of the genome, the antigenome, which in turn acts as a template for genome synthesis. The genomes are incorporated into nucleocapsids as they are synthesized and can presumably recycle through the RNA-synthesis pathways, or are transported to the plasma membrane for assembly into virus particles [for a detailed review of the RSV replication cycle, see Collins et al. (2001)].

In infected cells, the virus polymerase proteins are detected almost exclusively as large, cytoplasmic inclusions (García et al., 1993), which are presumed to be accumulations of nucleocapsids functioning as RNA-synthesis ‘factories’. There is evidence that nucleocapsids are associated with lipid-raft structures (McDonald et al., 2004; Brown et al., 2005) and the cytoskeleton-associated proteins actin and profilin are important for RSV RNA synthesis (García-Barreno et al., 1988; Barik, 1992; Burke et al., 1998, 2000; Ulloa et al., 1998; Kallewaard et al., 2005). These data suggest that RSV RNA-synthesis machinery is integrated intimately with host-cell components and that, when RSV nucleocapsids enter the cell, they co-opt lipid rafts and cytoskeleton proteins to form a structure on which RNA synthesis can occur.

Overview of the RSV transcription and replication strategy
The structure of the RSV genome is shown schematically in Fig. 1. The ten viral genes are arranged sequentially and each is flanked by conserved gene-start (GS) and gene-end (GE) sequences, which control the polymerase during transcription (Collins et al., 1986, 1987; Kuo et al., 1996a, 1997; Fearns & Collins, 1999a). At the 3' and 5' ends of the genome are short extragenic regions: a 44 nt leader (Le) and a 155 nt trailer (Tr), respectively (Collins et al., 1991; Mink et al., 1991). There is just one promoter for transcription and this lies at the 3' end of the genome and includes sequences in the Le (Dickens et al., 1984; Collins et al., 1991). The polymerase initiates RNA synthesis at this promoter and then progresses along the length of the genome. When it reaches a GE signal, the polymerase polyadenylates and releases the nascent mRNA. It then reinitiates RNA synthesis at the next GS signal and caps and methylates the 5' end of the RNA. By responding to the GE and GS signals in this fashion, the polymerase is able to generate subgenomic mRNAs. There is a tendency for the polymerase to dissociate from the template at the gene junctions and, because it can only initiate transcription at the 3' end of the genome, this results in a gradient of expression, with the genes at the 3' end of the genome being transcribed more frequently than those at the 5' end (Collins & Wertz, 1983; Kuo et al., 1996b; Hardy & Wertz, 1998; Krempl et al., 2002; Cheng et al., 2005). In keeping with this gene-expression strategy, the virus genome has evolved such that proteins required in large amounts, such as those involved in disabling innate immune defences, are expressed from the 3' end of the genome, whereas those required in small amounts, such as the polymerase, are expressed from the 5' end (reviewed by Collins et al., 2001).
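
The gradient of expression described above can be illustrated with a deliberately simple numerical sketch. It assumes, purely for illustration, that the polymerase dissociates with the same fixed probability at every gene junction and never re-enters downstream; real junctions are unlikely to behave identically. The function name, the dissociation value of 0.2 and the gene order (taken from the general RSV literature rather than from this section) are assumptions, not measurements.

```python
from typing import Dict, List

# Gene order from the 3' end of the genome. This order comes from the
# general RSV literature and is an assumption with respect to this section.
GENE_ORDER: List[str] = ["NS1", "NS2", "N", "P", "M", "SH", "G", "F", "M2", "L"]


def expression_gradient(genes: List[str], dissociation_prob: float) -> Dict[str, float]:
    """Relative transcription of each gene under a toy model.

    The polymerase starts at the 3'-proximal gene and, at each gene
    junction, falls off the template with probability `dissociation_prob`
    (hypothetical and uniform across junctions). The first gene is
    normalized to 1.0.
    """
    if not 0.0 <= dissociation_prob < 1.0:
        raise ValueError("dissociation_prob must be in the range [0, 1)")
    levels: Dict[str, float] = {}
    surviving = 1.0  # fraction of polymerases still on the template
    for gene in genes:
        levels[gene] = surviving
        surviving *= 1.0 - dissociation_prob  # loss at the next junction
    return levels


if __name__ == "__main__":
    # 0.2 is an arbitrary illustrative value, not an experimental estimate.
    for gene, level in expression_gradient(GENE_ORDER, 0.2).items():
        print(f"{gene:>3}: {level:.3f}")
```

Under this model, relative expression falls geometrically with distance from the 3' end, which captures the qualitative behaviour described in the text but not gene-specific effects.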


Fig. 1. Schematic diagram (not to scale) depicting the RSV genome and its transcription and replication products. The virus genes are depicted as grey rectangles; the L gene, which comprises almost half of the genome, has been truncated. The GS and GE signals are shown as white and black boxes, respectively. The encoded antigenome and mRNAs are indicated by hatched rectangles. Arrows indicate the location of the promoters.

 
To replicate the genome, the polymerase binds to a promoter in the Le region and initiates RNA synthesis at the 3' terminus. In this case, as the polymerase moves along the template, it does not respond to the cis-acting GE and GS signals and generates a full-length, positive-sense RNA complement of the genome, the antigenome. At the 3' end of the antigenome is the complement of the trailer (TrC), which contains a promoter. The polymerase uses this promoter to generate genome-sense RNA. Unlike mRNA, which is released as naked RNA, the antigenome and genome RNAs are encapsidated with virus nucleoprotein (N) as they are synthesized and concurrent encapsidation probably accounts for the ability of the polymerase to read through the GE signals during replication.
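
The template and product relationship described here can be sketched in a few lines of code. The example below is only an illustration, under the assumption that the template is written 3' to 5' (as the genome is conventionally drawn), so that the product read 5' to 3' is the base-by-base complement in the same left-to-right order. The function name is hypothetical. The test sequence is the conserved GS signal quoted later in this review; its complement is what the 5' end of an mRNA initiated at that signal would be expected to look like.

```python
# Watson-Crick complement table for RNA.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def complement_rna(template_3_to_5: str) -> str:
    """Return the complementary RNA strand.

    The input is a template written 3'->5'; the output is the product
    written 5'->3', so positions line up left to right.
    """
    seq = template_3_to_5.upper()
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"not an RNA sequence, found: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)


if __name__ == "__main__":
    gs_signal = "CCCCGUUUAU"  # genome-sense GS signal, as quoted in the text
    print(complement_rna(gs_signal))
```

Applying the function twice returns the original sequence, mirroring the way the antigenome in turn serves as the template for genome synthesis.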

One of the intriguing questions raised by this strategy is how the polymerase is directed to one or other of its dual functions, transcription and replication. It is possible that there are two pools of polymerase, i.e. transcriptase and replicase, that have different activities, or that initial interactions of the polymerase with the template commit it to one process or the other. These two possibilities are not mutually exclusive and both may come into play. This question underlies many of the studies on mononegavirus transcription and replication, and it may be answered completely only when we have a good understanding of the structure of the polymerase and the function of the RNA sequences with which it interacts.

General paradigms of RNA synthesis
In the following sections, the cis-acting sequences and viral proteins involved in RSV RNA synthesis are described in detail and models for how they might function are discussed. It is useful to give a brief overview of transcription by the well-characterized DNA-dependent RNA polymerases, as these provide an understanding of how polymerase, template and transcript can interact to execute the different events that occur during RNA synthesis and aid interpretation of the data regarding RSV transcription and replication.

Studies on prokaryote transcription have identified the following sequence of events (reviewed by Hsu, 2002; Young et al., 2002; Browning & Busby, 2004; Greive & von Hippel, 2005). First, bonds are formed between specific bases in the promoter and the promoter-binding site on the polymerase-initiation complex, securing the polymerase in place. The polymerase initiates RNA synthesis de novo (i.e. without a primer) by positioning the initiating nucleotide and the second NTP in the active site and catalysing formation of the first phosphodiester bond. Polymerization then continues with additional incoming NTPs entering the active site and interacting with the template strand of the DNA. Having initiated RNA synthesis, the polymerase must undergo ‘promoter escape’, in which it relinquishes the bonds that bind it in place. The transcription complex is highly unstable at this stage and abortive transcripts of ~2–10 nt in length are produced in significant amounts; it is thought that these transcripts are produced by polymerase that remains attached to the promoter and initiates RNA synthesis reiteratively before escaping successfully. Following (or concomitant with) promoter escape, the polymerase changes conformation and enters an elongation mode. Even once the polymerase has begun elongation, there is the possibility that it might stall, terminate (i.e. dissociate), backtrack or continue forward at every NTP-addition cycle. The probability of each outcome depends on a balance of factors, and increasing the rate of transcription and/or stabilizing the elongation complex increases polymerase processivity and promotes elongation. Termination takes place if the complex is destabilized and can occur in response to sequences in the template and/or transcript, and can also be effected by trans-acting termination factors. Termination results in dissociation of the polymerase from the template and release of the nascent RNA.

Although most of our understanding of transcription is derived from studies on prokaryotic polymerases, many of the features described above hold for eukaryotic polymerase II transcription (Dvir, 2002) and experiments on the positive-strand RNA virus Brome mosaic virus established that its polymerase follows similar steps in initiation, indicating that these paradigms can be applied to the RNA-dependent RNA polymerases (Sun et al., 1996; Sun & Kao, 1997a, b; Adkins et al., 1998). RNA synthesis by RSV and other members of the Mononegavirales is more complex than that described for cellular polymerases, for two main reasons. First, rather than having a promoter for each gene, the virus mRNAs are all produced from a single promoter and their synthesis is dependent on the polymerase stopping and restarting RNA synthesis as it moves along the genome. Second, the genome acts as a template for both transcription and replication and the promoters for these two processes overlap, so the polymerase must somehow be directed to initiate either transcription or replication. Therefore, although it is reasonable to assume that RNA synthesis by RSV and other non-segmented, negative-strand RNA viruses can be related to the transcription model described above, their strategy of gene expression and genome replication requires variation on these themes, as discussed below.

The core machinery of RSV transcription and RNA replication
The nucleocapsid template
A key feature of both the genome and antigenome templates is that they remain coated with N protein at all times and so it is the nucleocapsid that is the template for RNA synthesis, rather than naked RNA. This has several benefits for the virus; for example, encapsidation prevents formation of secondary structure in the RNA template, obviating the need for a helicase activity. It also protects the RNA from nuclease attack and reduces production of double-stranded RNA, which might help the virus to avoid provoking antiviral responses (Le Mercier et al., 2002).

The fact that the RNA is coated in N protein raises the question of how the polymerase is able to ‘see’ and bind to the promoter. Although structural analyses of the nucleocapsid are not yet of sufficient resolution to determine the exact positioning of the RNA relative to N molecules, chemical-modification studies have shown that, in the case of VSV and SeV, the bases of the RNA are exposed in the nucleocapsid, which could potentially allow incoming polymerase complexes to identify the promoter, even though the RNA backbone is encased (Iseni et al., 2000, 2002). In addition, in some paramyxoviruses, the N protein itself might be a component of the promoter. This is suggested by the fact that many paramyxoviruses obligatorily follow the ‘rule of six’, meaning that their genome nucleotide length must be divisible by 6 to be replicated (Calain & Roux, 1993). This reflects the fact that each molecule of N binds to 6 nt and that the promoter sequence in the RNA is recognized in the context of the molecular signature imparted by the N protein (Kolakofsky et al., 1998; Vulliémoz & Roux, 2001; Iseni et al., 2002). Thus, in these viruses, the N component of the nucleocapsid might either contribute directly to promoter structure (i.e. by forming a direct interaction with the polymerase) or might position the RNA bases appropriately for recognition by the polymerase. RSV does not follow the ‘rule of six’ and its genome can be manipulated to be any integer length without loss of template activity (Samal & Collins, 1996). These data indicate that the position of N molecules relative to the sequence in the RNA is not important, suggesting that, although the RSV polymerase recognizes the RNA in the context of the nucleocapsid (and might well interact with the N protein during template binding), there are no specific interactions between polymerase and N protein to position the polymerase over the promoter.
Following promoter binding, the RNA template must be brought into the active site of the polymerase. How this happens is not known, but either the structure of the nucleocapsid is such that the polymerase can protrude far enough into it to position the RNA into its active site or the polymerase displaces RNA temporarily from the local N molecules. Electron-microscopy studies have indicated that RSV nucleocapsids are highly flexible structures (Bhella et al., 2002) and this might allow the nucleocapsid to bend to accommodate the large polymerase complex during RNA synthesis.
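
The ‘rule of six’ discussed above is a simple divisibility constraint, and it can be stated as a short check. The sketch below is illustrative only: the function names are hypothetical and the lengths used in the example are arbitrary numbers, not real genome sizes. It applies to viruses that obey the rule; as noted above, RSV does not.

```python
def obeys_rule_of_six(genome_length: int) -> bool:
    """True if a genome of this length (in nt) is a multiple of six."""
    if genome_length <= 0:
        raise ValueError("genome_length must be a positive integer")
    return genome_length % 6 == 0


def nucleotides_to_pad(genome_length: int) -> int:
    """Smallest number of nt to add so the length becomes a multiple of six."""
    if genome_length <= 0:
        raise ValueError("genome_length must be a positive integer")
    return (-genome_length) % 6


if __name__ == "__main__":
    # Arbitrary example lengths, chosen only to show both outcomes.
    for length in (1200, 1203):
        print(length, obeys_rule_of_six(length), nucleotides_to_pad(length))
```

A helper of this kind is what one would use when designing a minigenome for a virus that requires hexamer length; for RSV, the text indicates that no such adjustment is needed.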

Although the termini of the RSV genome are complementary, there is no evidence that complementarity is essential for RNA synthesis or that the termini interact, indicating that the RSV template is a linear, rather than panhandle, structure and that the promoter is confined to the 3' end (Peeples & Collins, 2000; Fearns et al., 2002; Cowton & Fearns, 2005). It was found that a critical feature of the nucleocapsid is the structure afforded by its 3' terminus. If the promoter sequence was displaced from the 3' terminus of the nucleocapsid by a small number of nucleotides (≥6), mRNA and antigenome synthesis was inhibited almost completely (Collins et al., 1991; Cowton & Fearns, 2005). However, an embedded promoter could be recognized by a polymerase that had already been engaged on the template (Cowton & Fearns, 2005). Thus, it appears that the 3' terminus of the nucleocapsid is important for recruitment of polymerase to the promoter, even though it is not necessary for promoter recognition per se. There are at least two possible explanations for this. First, the 3' terminus of the nucleocapsid might combine with the promoter to provide a structure for polymerase assembly. In the case where short (≤5 nt) insertions are made between these elements, the promoter complex might expand its footprint to accommodate the 3' terminus and the promoter simultaneously. Second, it is possible that the polymerase interacts first with the structure conferred by the 3' terminus of the nucleocapsid and then displaces the first few N molecules from the end of the RNA in order to interact with the promoter sequence. In this model, proximity of the promoter to the 3' end is important because displacement of N protein can only occur from the terminus, and short 3' extensions are tolerated because each N monomer binds several nucleotides and, so, displacement of one or two N molecules could still reveal the promoter sequence.

The polymerase complex
The minimal complex required for RSV polymerase activity consists of the large protein (L) and phosphoprotein (P) (Mazumder & Barik, 1994). The L–P complex is believed to be responsible for recognition of the promoter, RNA synthesis, capping and methylation of the 5' termini of the mRNAs and polyadenylation of their 3' ends. There is evidence that the L protein contains the enzymic domains required for these processes and its multifunctional nature tallies with its size, with the L gene accounting for almost half of the RSV coding potential (Stec et al., 1991). Alignment of sequences of non-segmented, negative-strand RNA virus L proteins identified six conserved regions (I–VI), which are thought to represent functional domains (Poch et al., 1990). Because of the similarity of these regions between viruses, any function that can be attributed to a conserved region in one virus can probably be extrapolated across the order. On this basis, there is information on which regions of RSV L are involved in polymerization, capping and methylation. Regions II and III probably comprise the polymerization domain, as they contain motifs that are highly conserved in all RNA-dependent RNA polymerases (Poch et al., 1989). The structure of other polymerases has been likened to a right hand, in which the thumb subdomain clamps down onto the RNA template and the palm and finger subdomains form a pocket into which incoming NTPs enter to be added to the growing chain. These subdomains also contain amino acids that are involved in metal-ion coordination, which is necessary for polymerization activity (O'Reilly & Kao, 1998). Although the structure of the polymerase domain has not been determined for any of the negative-sense RNA viruses, their conserved motifs have been modelled onto the known structure of a reverse transcriptase and indicate that the palm and finger subdomains are conserved (Müller et al., 1994).

The possibility that the RSV polymerase mediates capping was first suggested by the finding that the virus mRNA cap structure is different from that of cellular mRNAs (Barik, 1993). A recent study utilizing an inhibitor that prevents capping of RSV mRNAs attributed guanylyltransferase activity to region V of the L protein and it was proposed that this region acts as a nucleotide-binding domain in guanylylation (Liuzzi et al., 2005). In silico analysis predicted that region VI possesses methyltransferase activity (Ferron et al., 2002) and this has now been borne out by functional studies on the VSV and SeV L proteins (Grdzelishvili et al., 2005; Li et al., 2005; Ogino et al., 2005).

The phosphoprotein is essential for polymerase activity (Mazumder & Barik, 1994; Barik et al., 1995; Grosfeld et al., 1995; Yu et al., 1995; Dupuy et al., 1999) and, by analogy with other mononegaviruses, might mediate contact between the L protein and the nucleocapsid template (Mellon & Emerson, 1978). A study on fragments of measles virus N and P proteins has shown that the two proteins have fast binding kinetics and weak binding affinity; these features probably allow rapid association and dissociation of the N–P proteins, which could facilitate L–P polymerase movement along the N–RNA template (Kingston et al., 2004). Although there are no similar structural data available for RSV N and P proteins, it is likely that functional features of these proteins are conserved among members of the Mononegavirales and that they interact in a similar manner (Karlin et al., 2003).

As its name suggests, the P protein is phosphorylated and there have been several studies to investigate whether there is a role for phosphorylation in coordinating polymerase activity (Barik et al., 1995; Sánchez-Seco et al., 1995). Studies using recombinant viruses or an intracellular plasmid-based assay in which the potential phosphorylation sites on P were abrogated indicated that phosphorylation is not essential for P protein function (Villanueva et al., 2000; Lu et al., 2002). However, a study using an in vitro RSV RNA-synthesis assay provided a different result. In this study, if phosphorylation at the major acceptor site (serine 232) was abolished, the polymerase was still able to initiate RNA synthesis at the 3' terminus of Le, but only produced short transcripts, predominantly 9 or 11 nt long (Dupuy et al., 1999). Given that the RSV promoter is likely to consist of Le nt 1–11 (see below), these are the correct length to be abortive transcripts, indicating that P phosphorylation might be necessary to allow the polymerase to navigate promoter escape. It is currently unclear why the intracellular and in vitro studies give apparently different results. One possibility is that cellular factors can complement phosphorylated P activity. Another possibility is that the promoter sequences used were slightly different, which could potentially affect the efficiency of promoter escape.

Although the L–P complex is sufficient to perform RNA synthesis in vitro and can be considered as the core polymerase, this complex is unlikely to function independently in nature. As described above, cellular proteins actin and profilin are necessary for optimal RNA-synthesis activity and could potentially be part of the polymerase complex (Burke et al., 1998, 2000). The viral M2-1 protein is required for processivity during transcription and is likely to be a component of the ‘transcriptase’ complex (Collins et al., 1996). The viral N protein is required to encapsidate the nascent RNA during replication and it is possible that the ‘replicase’ consists of a complex of L–P–N, as has been shown for VSV (Qanungo et al., 2004).

Transcription of mRNAs
Models for transcription initiation in the Mononegavirales
The Le sequence at the 3' end of the genome is highly complex, as it is important for both transcription and replication initiation. How it is able to direct two processes is not known and determining the site of transcription initiation is key to understanding this. It is known that almost all mRNA from the first gene of the RSV genome is initiated at the first GS signal at position 45 (Collins & Wertz, 1985). However, the mechanism by which the polymerase initiates at this site is poorly understood and remains controversial. There are two basic models for transcription initiation in the order Mononegavirales, based mainly on studies with SeV and VSV. Model 1 proposes that the polymerase first initiates RNA synthesis opposite the first nucleotide of the genome and transcribes the Le region. The polymerase then releases this transcript at, or near, the end of the Le region, locates the first GS signal and uses this to reinitiate RNA synthesis. Model 2 proposes that the polymerase initiates transcription directly at the GS signal. According to this model, there are two pools of polymerase, dedicated to either replication or transcription, that recognize distinct initiation sites. There are a number of studies that support model 1 (reviewed by Kolakofsky et al., 2004); for example, reverse-genetics analyses with SeV indicate that the nature of the promoter sequence can affect how efficiently the polymerase can initiate at the GS signal, which indicates that the two events are linked (Vulliémoz & Roux, 2002; Le Mercier et al., 2003; Vulliémoz et al., 2005). Model 2 is supported by studies with VSV that indicate that initiation at the first GS signal occurs independently of Le transcription (Chuang & Perrault, 1997; Whelan & Wertz, 2002).
Arguments have been made against these conclusions (Kolakofsky et al., 2004), but compelling evidence for model 2 is derived from a study of VSV RNA synthesis in vitro using purified polymerase, which showed that two pools of polymerase complex could be identified that were associated with different proteins. In this study, the replicase and transcriptase complexes were found to initiate at the 3' end of Le or directly at the GS signal, respectively (Qanungo et al., 2004). Presumably, in this case, the transcriptase either binds directly at the GS signal or scans from the 3' end of the nucleocapsid to the GS signal without initiating RNA synthesis. Currently, it is difficult to reconcile the data supporting the two models. One possibility is that paramyxo- and rhabdoviruses use different initiation mechanisms and, in this respect, it is noteworthy that the organization of the RSV and VSV promoter regions appears to be different (Li & Pattnaik, 1999; Whelan & Wertz, 1999; McGivern et al., 2005). Alternatively, it is possible that both transcription-initiation mechanisms come into play during infection and so experimental evidence can be obtained to support both models. The cis-acting signals for RSV transcription initiation have been broadly mapped and, although they do not distinguish between these two models, they do provide information on the initial interactions of the polymerase with the template.

Initiation of transcription at the 3' end of the RSV genome
RSV transcription initiation involves three sequence elements: a region at the 3' end of the Le, a U-rich region at the end of the Le and the first GS sequence (indicated in bold type in Fig. 2a). As yet, there are no biochemical data regarding the functions of these sequences in RSV, but reverse-genetics analysis has suggested what roles they could play.


Fig. 2. cis-acting sequences involved in transcription initiation. (a) The RSV Le and GS sequence, with the regions required for transcription shown in bold type and the nucleotides not required shown in normal type. The GS signal for the first gene (NS1) is underlined. Substitutions at positions 4 and 12 (italicized) can inhibit transcription and augment replication, indicating that these are optimized for transcription. (b) Alignment of the 3' termini of the genomes of members of the Pneumovirinae: RSV A2 (hRSV), bovine RSV (bRSV), pneumonia virus of mice (PVM), human metapneumovirus (hMPV) and avian metapneumovirus (APV). GenBank accession numbers are AF035006, NC_001989, AY729016, NC_004148 and AY590688, respectively. The sequence is shown as negative-sense RNA and nucleotides identical to the RSV genome sequence are highlighted in bold type. Gaps, indicated by dashes, were introduced to maximize alignment. The GS sequences are underlined.

 
Saturation-mutation analysis of the first half of the RSV Le identified a number of nucleotides near the Le 3' terminus as being important for mRNA synthesis: substitutions at positions 3, 4, 5, 8, 9, 10 and 11 reduced mRNA production to <30 % of wild-type levels (Fearns et al., 2002; McGivern et al., 2005). Thus, regardless of the site of transcription initiation, mRNA synthesis from the GS signal is dependent on a sequence at the 3' end of the Le. There is evidence that this sequence is a polymerase-binding site. As described above, it was shown that the Le sequence must be at, or near, the 3' end of the template for RNA synthesis to occur (Collins et al., 1991; Cowton & Fearns, 2005). Further analysis to determine how much of Le needed to be proximal to the 3' terminus showed that only the first 11 nt of the Le were required. Therefore, this finding suggests that nt 1–11 contain a polymerase-binding site that functions cooperatively with the 3' terminus of the nucleocapsid to recruit transcriptase to the template (Cowton & Fearns, 2005).

The first GS signal (the NS1 GS sequence) is also integral to the sequential-transcription programme (Fig. 2a). If this sequence is ablated, a small amount of transcript corresponding to the first gene is produced, but it is initiated at the 3' terminus of the Le and downstream genes are transcribed at only approximately 10 % of wild-type levels (Kuo et al., 1996a). Thus, the GS signal is required for accurate and efficient transcription initiation. The sequence of the 3'-proximal GS signal is the same as most of the other RSV GS signals and this sequence responded similarly to saturation mutagenesis, irrespective of whether it was at the start of the first or second gene (Kuo et al., 1997). These data indicate that the 3'-proximal GS sequence is functionally equivalent to the internal GS signals, discussed in detail below.

Although the 3'-terminal 11 nt and GS sequence are sufficient to signal transcription initiation, alone they are inefficient. Efficient transcription depends on a region at the end of Le, immediately upstream of the GS signal (Fig. 2a; Fearns et al., 2000; McGivern et al., 2005). Although this has not been mapped precisely, it contains a stretch of uridylates that are conserved in the Le regions of all pneumoviruses sequenced to date (Fig. 2b), suggesting that the U-rich nature of this sequence is important. The function of this sequence is not known, but in VSV, it has been shown that GS-signal activity is dependent on the presence of an upstream U tract (Hinzman et al., 2002).

If the U-rich region and GS signal are moved closer to the element at the 3' end of Le, then transcription is inhibited significantly. The signals can still function if they are moved further apart; for example, a 60 nt insertion upstream of the U-rich region is tolerated, but insertions of 80 nt or more are inhibitory (Fearns et al., 2000; McGivern et al., 2005). These data suggest that there is some latitude in the spacing between the 3' element and the U-rich region and GS signal; however, they cannot function if they are brought closer together and they cannot be displaced too far.

These findings do not distinguish between the two models of transcription, although they do indicate that the transcriptase is recruited to the 3' terminus of the genome, rather than directly at the GS site. The polymerase presumably then moves to the GS signal to initiate mRNA synthesis. The possible mechanisms by which the polymerase could migrate from the 3' terminus to the GS signal are discussed in detail in a later section. The signals probably cannot function well when they are closer together because polymerase pausing at the GS signal inhibits polymerase binding to the signal at the 3' end.

Initiation of mRNA synthesis at the GS signals
The GS signals direct the polymerase to initiate mRNA synthesis. These are highly conserved sequences, with only one, the L GS signal, differing from the sequence ‘CCCCGUUUAU’ (Collins et al., 1986, 1987). It is not clear exactly how the GS signals are able to direct RNA-synthesis initiation, as they do not act as promoters to recruit exogenous polymerase to the template. However, it is possible that the sequence can act similarly to a promoter for a polymerase that is already in a template-bound conformation, forming specific contacts that hold the polymerase in place for sufficient time to allow the initiating nucleotide to enter the polymerase active site and act as a substrate for RNA-synthesis initiation.
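As a small illustration of how the conserved GS template sequence relates to the 5' end of the mRNA, the sketch below complements the signal base by base. It assumes that the GS sequence quoted above is written 3'→5' in negative sense, as in Fig. 2, so that the complement reads 5'→3' in the transcript; the function name `mrna_start` is illustrative only.

```python
# Watson-Crick pairing for RNA templates.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mrna_start(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') copied from a template written 3'->5'.

    Assumption: the template is given 3'->5', so complementing each base
    in order gives the transcript in the conventional 5'->3' direction.
    """
    return "".join(COMPLEMENT[base] for base in template_3_to_5.upper())


GS_CONSENSUS = "CCCCGUUUAU"  # conserved GS signal quoted in the text

if __name__ == "__main__":
    print(mrna_start(GS_CONSENSUS))  # GGGGCAAAUA
```

Under this assumption, the first templated nucleotide of the mRNA is a G, which is the residue referred to as the wild-type first position in the discussion of quasi-templating.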

As described above, there is evidence that the virus L protein is responsible for capping the mRNA transcripts (Liuzzi et al., 2005). Uncapped transcripts are truncated after approximately 50 nt, suggesting that addition of the cap structure allows the polymerase to change into a stable elongation mode (Liuzzi et al., 2005). This is reminiscent of RNA polymerase II transcription, in which cap addition and elongation are linked, providing a checkpoint mechanism to ensure that only capped transcripts are extended (reviewed by Orphanides & Reinberg, 2002; Shilatifard, 2004; Zorio & Bentley, 2004). Studies with VSV and SeV indicate that both the capping and methylation reactions are dependent on interplay between the sequence encoded by the GS signal and the virus polymerase (Rose, 1975; Stillman & Whitt, 1999; Ogino et al., 2005). Saturation-mutagenesis analysis of the RSV GS sequence showed that it is highly stringent, with most substitutions causing a significant decrease in gene expression (Kuo et al., 1997). This rigidity is striking and may reflect the possibility that the RSV GS sequence is important for modifying the 5' end of the nascent RNA, in addition to initiation of mRNA synthesis.

Sequence analysis of transcripts from a mutant RSV GS signal revealed a surprising phenomenon termed ‘quasi-templating’. A GS signal containing a substitution at position 1 generated a mixed population of mRNA transcripts: two-thirds contained the exact complement of the mutant GS sequence, but one-third contained a wild-type ‘G’ residue at the first position (Kuo et al., 1997). Thus, it appears that the ‘correct’ nucleotide can be inserted at the first position of the transcript, even if this is contrary to what is being directed by the template. It should be noted that, although one-third of transcripts showed evidence of quasi-templating in this study, this might not represent one-third of initiations. The position 1 GS mutants only produced low levels of detectable full-length mRNA, indicating that the mutation hindered some aspect of mRNA synthesis. If the nucleotide at position 1 was important for capping of the mRNA, any transcripts containing the ‘correct’ non-templated nucleotide at this position would be more likely to be extended to the end of the gene than transcripts containing the mutant assignment. Thus, in this scenario, transcripts in which the first position was quasi-templated would be enriched, suggesting a deceptively high level of quasi-templating. Nonetheless, even if this caveat is taken into account, the evidence suggests that at least 3 % of initiations are quasi-templated, which is significantly higher than the expected error rate of an elongating polymerase (≤10⁻³). This raises the possibilities that the polymerase is either very error-prone during initiation of mRNA synthesis, that it preferentially uses a G residue as the initiating nucleotide or that it can initiate by using a prime–realign mechanism.
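The enrichment caveat described above can be made concrete with a short calculation. The sketch below is a toy model, not the analysis used in the cited study: it assumes that a quasi-templated transcript is some fixed factor more likely to be completed than a transcript carrying the mutant nucleotide, and back-calculates the fraction of initiations that would give the observed one-third of full-length mRNAs. The enrichment factors used are hypothetical, and the source does not state how the 3 % lower bound was derived.

```python
def true_quasi_fraction(observed: float, enrichment: float) -> float:
    """Estimate the fraction of initiations that are quasi-templated.

    observed   -- fraction of full-length mRNAs with the non-templated G
    enrichment -- hypothetical factor by which quasi-templated transcripts
                  are more likely to reach full length than mutant ones

    Derivation: if q is the true fraction and r the enrichment factor,
    observed = q*r / (q*r + (1 - q)), which rearranges to the line below.
    """
    return observed / (observed + enrichment * (1.0 - observed))


if __name__ == "__main__":
    for r in (1.0, 4.0, 16.0):  # hypothetical enrichment factors
        q = true_quasi_fraction(1.0 / 3.0, r)
        print(f"enrichment x{r:g}: {q:.3f} of initiations quasi-templated")
```

With no enrichment the estimate equals the observed one-third; in this toy model the estimate only falls to about 3 % if quasi-templated transcripts are roughly 16-fold more likely to be completed.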

Elongation of the mRNA transcripts: the role of M2-1 in transcriptase processivity
Having initiated RNA synthesis, capped the mRNA and entered an elongation mode, the transcriptase remains subject to destabilization. It can terminate prematurely at multiple intragenic sites, presumably in response to sequences in the template and/or transcript that undermine the integrity of the elongation complex (Grosfeld et al., 1995; Collins et al., 1996; Fearns & Collins, 1999b). This is prevented from occurring by the M2-1 processivity factor. Although the main function of M2-1 is to prevent inappropriate intragenic termination, it can cause the polymerase to read through the GE sequences and this is presumably another manifestation of the same processivity function (Hardy & Wertz, 1998; Hardy et al., 1999; Sutherland et al., 2001). A direct comparison of the intragenic and GE anti-termination activities indicated that a higher concentration of M2-1 was required for processivity through a GE sequence, which is consistent with this view (Fearns & Collins, 1999b). Because transcription of RSV genes is dependent on complete and sequential transcription of each upstream gene, M2-1 activity is necessary not only to allow the polymerase to transcribe to the end of each gene, but also to access the promoter-distal regions of the genome. In addition, its ability to cause readthrough at the intergenic regions could alleviate the transcription gradient, as discussed previously by Hardy & Wertz (1998) and Whelan et al. (2004). The M2-1 protein has a Cys3–His1 motif near its N terminus that has been proposed to interact with zinc, allowing appropriate folding of the M2-1 protein and binding to RNA (Hardy & Wertz, 2000; Cartee & Wertz, 2001; Tang et al., 2001; García-Barreno et al., 2005). In addition, M2-1 binds to P and this is necessary for M2-1 function (Mason et al., 2003).
Early studies suggested an interaction between N and M2-1, but this has since been shown to be an artefact of their RNA-binding activities (Cartee & Wertz, 2001). Thus, it seems likely that M2-1 is a component of the transcriptase complex and interacts with the P moiety of the polymerase and either the nascent or template RNA. There are data indicating that M2-1 binds specifically to RNA sequence encoded by the Le region, suggesting that M2-1 might be ‘loaded’ onto the polymerase during transcription initiation (Cuesta et al., 2000). However, M2-1 activity is evident with minigenomes that have large substitutions in the Le region (Collins et al., 1991; McGivern et al., 2005). Therefore, this interaction either reflects specificity for the first 11 nt encoded by Le, which were conserved in these mutants, or a preference for binding to AU-rich RNA.

Transcription termination at the GE signals
At the end of each gene is a GE signal consisting of a conserved sequence followed by a uridylate tract, which signals the polymerase to polyadenylate and release the nascent mRNA. It has not been determined how the RSV GE signal controls these events; however, extensive functional analysis of the VSV GE signal has been performed and these findings provide a model for RSV termination. The VSV GE sequence is ‘AUACUUUUUUU’ and the AUAC sequence functions together with the U tract to cause the polymerase–transcript complex to undergo repetitive cycles of slippage relative to the template, whilst the polymerase active site is positioned on the U tract, resulting in polyadenylation of the mRNA transcript. It is currently unclear how release of the mRNA after a certain number of recurring adenylation cycles is controlled, but there is evidence to suggest that the poly(A) tail of the nascent mRNA is a factor in the release process, which might provide a mechanism to control poly(A) tail length (reviewed by Barr et al., 2002). In VSV, the GE signal is not the only important factor in termination, as there is evidence of an intimate link between 5' modification of the mRNA and correct polyadenylation and release, indicating that termination is a complex process (Rose et al., 1977; Whelan et al., 2000).

In contrast to the VSV GE signals, those of RSV are somewhat divergent and vary in their termination efficiencies (Kuo et al., 1997; Hardy et al., 1999; Tran et al., 2004). Most are not maximally efficient and, so, readthrough mRNAs representing two or more genes are found in infected cells (Collins & Wertz, 1983). The GE signal of RSV (A2 strain) consists of a UCAAU motif that is conserved in nine of the ten GE signals, a variable region of 3 or 4 nt, mostly consisting of adenylate and uridylate residues, and a tract of at least four uridylate residues. Each of these elements is important for GE function and the nucleotides immediately adjacent to the GE signal can also affect termination efficiency (Harmon et al., 2001; Sutherland et al., 2001; Harmon & Wertz, 2002; Cartee et al., 2003; Moudy et al., 2003). It is likely that the U tract of the GE signal is important for polyadenylation and that the A/U nature of the sequence contributes to polymerase instability (Whelan et al., 2004).
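The three-part GE description above (UCAAU motif, 3 or 4 nt variable region, tract of at least four uridylates) can be expressed as a simple pattern search. The sketch below is illustrative only: the pattern allows any nucleotide in the variable region, it would miss the one GE signal that diverges from UCAAU, and the test sequences are invented rather than taken from the RSV genome.

```python
import re
from typing import List

# UCAAU motif, 3-4 variable nt, then a tract of four or more U residues.
# Assumption: sequences are given in genome (negative) sense, as in the text.
GE_LIKE = re.compile(r"UCAAU[ACGU]{3,4}U{4,}")


def find_ge_like(seq: str) -> List[int]:
    """Return 0-based start positions of GE-like matches in an RNA string."""
    return [m.start() for m in GE_LIKE.finditer(seq.upper())]


if __name__ == "__main__":
    hypothetical = "AGUCAAUGAAUUUUUGC"  # invented example, not an RSV sequence
    print(find_ge_like(hypothetical))   # [2]
```

A pattern of this kind captures only the sequence composition of the signal; it says nothing about termination efficiency, which the text notes also depends on the adjacent nucleotides.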

Two RSV L mutants that arose serendipitously have suggested a specific role for the polymerase in transcription termination. Each mutant contains a single amino acid substitution, one at aa 1049 and the other at aa 1169. The mutant polymerases appear to be functional in every respect, except that they demonstrate high levels of readthrough at the gene junctions and thus are somewhat defective in transcription termination (Juhasz et al., 1999; Cartee et al., 2003). This raises a question as to what these mutations are altering. One possibility is that the wild-type polymerase has evolved such that its processivity during transcription is not maximal, but is finely tuned to ensure adequate termination at the GE signals, and that the mutations alter polymerase structure such that its processivity is increased. However, the data for at least one of the mutants suggest that this is not the case. The 1049 mutant polymerase displays a dependence on M2-1 to synthesize complete mRNA transcripts similar to that of the wild-type polymerase, demonstrating that it is not more processive (Cartee et al., 2003). Thus, it appears that the mutation affects termination specifically at the GE signals and not termination per se, indicating that termination at the GE signals involves a specific site on the polymerase and is not simply due to instability between the polymerase active site, template and transcript.

Evidence for polymerase scanning from GE to GS
In other polymerases, transcription termination results in release of the polymerase complex from the transcript and template. However, in the order Mononegavirales, most of the polymerase that terminates at the GE signal releases the transcript without dissociating from the template. Because the polymerase has released the mRNA, its active site is free to reinitiate mRNA synthesis if it encounters a GS signal. At most RSV gene junctions, there is an intergenic region between the GE and GS signals and the polymerase is thought to scan this to locate the next GS signal.

The RSV intergenic regions vary from 1 to 56 nt in length and diverge between different virus strains (Collins et al., 1986; Johnson & Collins, 1988; Tolley et al., 1996). Other than the nucleotides immediately flanking the GE and GS sequences, there are no obvious sequence motifs in the intergenic regions except that they are highly U-rich, particularly at the end of the longer intergenic regions. Different intergenic regions could be exchanged for each other with no significant effect on transcription of the flanking genes (Kuo et al., 1996b) and variations in the intergenic regions of clinical isolates did not have an obvious effect on transcription of the downstream gene (Moudy et al., 2004). These data suggest that the intergenic regions do not fulfil an essential function in gene expression, although substitution of a wild-type intergenic region with an artificial G/C-rich sequence caused a 30 % reduction in expression of the downstream gene, demonstrating that certain sequences can be inhibitory to reinitiation (Kuo et al., 1996b).

The fact that the intergenic regions are variable in length indicates that polymerase can scan the template from GE to GS sequences and there is evidence to suggest that this scanning activity is highly efficient. For example, a downstream gene was still expressed even if the intergenic region was extended to 612 nt, although longer intergenic regions could not be tolerated (Bukreyev et al., 2000; A. Bermingham & P. Collins, personal communication). A study of the unusual M2/L gene junction, in which the GS signal for the (downstream) L gene lies upstream of the M2 GE signal, also provides evidence of polymerase scanning (Collins et al., 1987; Fearns & Collins, 1999a). The arrangement of this junction is shown in Fig. 3. Reverse-genetics analysis indicated that the polymerase transcribes M2 to the GE signal and then scans backwards to reinitiate at the L GS site. It was also shown that, having reached a GE signal, the polymerase could reinitiate RNA synthesis from either an upstream or a downstream GS signal. Thus, it appears that, having released mRNA at a GE signal, the polymerase can maintain a stable interaction with the template and scan in either direction to locate a GS signal.


Fig. 3. Diagram showing the arrangement of signals at the typical and M2/L gene junctions. The virus genes are depicted as grey rectangles and the GS and GE sequences are shown as white and black boxes, respectively. (a) Arrangement of a typical gene junction, in which the intergenic region can vary from 1 to 56 nt. (b) M2/L gene junction, in which there are 46 nt between the end of the L GS signal and the start of the M2 GE signal.

 
A number of cellular proteins are capable of scanning for a specific sequence within a nucleic acid backbone and these provide a model for how RSV polymerase scanning might occur. One of the best characterized is the Escherichia coli lac repressor protein. The lac repressor operator DNA-target site contains a sequence that allows formation of specific hydrogen bonds between the repressor and the DNA. To locate this sequence, the lac repressor diffuses along DNA by virtue of electrostatic interactions with the sugar–phosphate backbone. When it encounters its binding site, the protein undergoes a conformational change and hydrogen bonds are formed with the bases in the DNA, holding the repressor in place (von Hippel, 2004). In the case of RSV, a scanning model must take the encapsidated nature of the template into consideration. As described above, the nature of the P–N interactions may allow rapid movement of the polymerase along the nucleocapsid and, as the whole RNA genome is coated with N protein, it would be expected that there is no net loss or gain of energy as P–N interactions are broken and reformed (Kolakofsky et al., 2004). Thus, once the polymerase has released the mRNA of an upstream gene, it could diffuse along the template until it reaches the GS signal and forms specific contacts with the bases in this sequence. If this model is correct, the artificial, G/C-rich intergenic sequence that inhibited reinitiation at a downstream GS signal might have introduced hydrogen-bond donors or acceptors that mimicked those in the GS sequence and stalled the polymerase before it reached the authentic GS signal.

Having reinitiated mRNA synthesis, the polymerase can continue along the template, elongating to the end of a gene, polyadenylating and releasing the mRNA and then scanning and reinitiating mRNA synthesis at the start of the next gene, in sequence until it reaches the end of the template. As indicated above, this cycle of RNA synthesis is not completely efficient and the polymerase can dissociate from the template, resulting in a gradient of gene expression. In VSV, polymerase dissociation was found to occur mainly at, or near, the gene junctions (Iverson & Rose, 1981) and presumably occurs either due to dissociation during termination at the GE signals, failure to locate and initiate at the GS signal, failure to cap the mRNA or a combination of the above.
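The way in which junction-associated dissociation produces a gradient of gene expression can be illustrated with a deliberately simple model. The sketch below assumes that the polymerase continues past every gene junction with the same probability, so that relative expression falls geometrically with gene position. The uniform probability and the value used are hypothetical; as noted above, the RSV GE signals in fact differ in efficiency, and readthrough is ignored here.

```python
from typing import List


def expression_gradient(n_genes: int, p_continue: float) -> List[float]:
    """Relative expression of each gene, normalized to the first gene.

    p_continue is a hypothetical, uniform probability that a polymerase
    which has transcribed one gene reinitiates at the next GS signal.
    """
    if not 0.0 <= p_continue <= 1.0:
        raise ValueError("p_continue must be between 0 and 1")
    return [p_continue ** k for k in range(n_genes)]


if __name__ == "__main__":
    # Ten genes, as in RSV; 0.8 is an arbitrary illustrative value.
    for position, level in enumerate(expression_gradient(10, 0.8), start=1):
        print(f"gene {position:2d}: {level:.3f}")
```

In this model, raising the continuation probability flattens the gradient; it does not distinguish between the possible causes of dissociation listed above.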

In summary, the data suggest that, during RSV transcription, the polymerase is recruited to the 3' end of the nucleocapsid, translocates to the first GS signal and initiates mRNA synthesis. From this point, RSV transcription is similar to transcription by cellular polymerases, with GS signals acting like promoters to template-bound polymerase, and polymerase processivity and termination being dependent on a combination of cis- and trans-acting factors. A difference between RSV and cellular transcription lies in termination, with a large proportion of the RSV polymerase maintaining a template-bound conformation and entering a scanning mode following mRNA release.

Genome replication
One fundamental difference between transcription and replication is that, during replication, the polymerase fails to respond to the GE signals, allowing it to synthesize a complete complement of the genome. The second significant difference is that the replication product is encapsidated as it is synthesized and it is likely that these two features are linked. Therefore, in considering replication initiation, it is important to consider not only how RNA synthesis is initiated, but also how the nascent RNA might become encapsidated.

cis-acting signals for antigenome initiation
The first 34 nt of the Le region contain all the sequences required for antigenome synthesis (indicated in bold type in Fig. 4a), setting RSV and other pneumoviruses apart from other members of the family Paramyxoviridae, which have bipartite replication promoters that extend into the first gene (Collins et al., 1991; Murphy et al., 1998; Tapparel et al., 1998; Murphy & Parks, 1999; McGivern et al., 2005). Saturation-mutagenesis analysis of the Le region showed that nt 3, 4, 5, 8, 9, 10 and 11, which are important for transcription, are also necessary for replication and, with the exception of one substitution at position 4, substitutions at these positions affected mRNA and antigenome synthesis similarly, suggesting that these nucleotides play a common role in both processes. In addition, mutations at positions 1, 2, 6 and 7 inhibited antigenome production significantly, although they had a relatively minor effect on mRNA synthesis, indicating that these nucleotides play a role specifically in replication (Fearns et al., 2002). By using primer-extension analysis to detect RNA transcripts initiated at the Le 3' terminus, it was determined that the first 13 nt of Le are sufficient to direct initiation of RNA synthesis (Cowton & Fearns, 2005; McGivern et al., 2005) and, similarly to the findings for transcription, there is evidence that replicase is recruited by the first 11 nt of Le in conjunction with the 3' end of the nucleocapsid. Together, these data indicate that Le nt 1–11 contain a binding site for the replicase and can function as a promoter to signal antigenome synthesis. Currently, it is not known whether nt 3, 4, 5, 8, 9, 10 and 11 are sufficient to signal replication initiation or whether nt 1, 2, 6 and 7 are also required.
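The mutagenesis results summarized above can be tabulated as two sets of Le positions. The sketch below simply encodes the positions listed in the text and checks that, together, they account for every nucleotide of Le nt 1–11. The category labels and the function name are illustrative, and the special behaviour of one substitution at position 4 is not modelled.

```python
# Le positions (1-based, from the 3' terminus) as listed in the text.
SHARED = {3, 4, 5, 8, 9, 10, 11}   # important for transcription and replication
REPLICATION_BIASED = {1, 2, 6, 7}  # major effect on antigenome, minor on mRNA


def classify_le_position(position: int) -> str:
    """Return an illustrative label for a position in the Le region."""
    if position in SHARED:
        return "transcription and replication"
    if position in REPLICATION_BIASED:
        return "mainly replication"
    return "outside nt 1-11"


if __name__ == "__main__":
    for pos in range(1, 14):
        print(pos, classify_le_position(pos))
    # The two sets are disjoint and together cover nt 1-11.
    print(SHARED | REPLICATION_BIASED == set(range(1, 12)))
```

That the two sets partition nt 1–11 is consistent with the conclusion that this stretch as a whole contains the replicase-binding site.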


Fig. 4. cis-acting sequences involved in replication initiation. (a) The RSV Le and GS sequence, with the regions required for replication shown in bold type and the nucleotides that are not required shown in normal type. Note that the nucleotides required for encapsidation have not been mapped precisely. The GS signal is underlined. (b) Alignment of the 3' termini of the antigenome RNAs of members of the Pneumovirinae. The sequences are shown aligned to the 3' terminus of the RSV genome, with identical nucleotides in bold type. Spaces introduced to maximize alignment are indicated by dashes. The GS sequence in the RSV genome is underlined. Abbreviations and GenBank accession numbers are the same as for Fig. 2.

 
Although the 3' terminus of the nucleocapsid is clearly important for polymerase recruitment to the template, it apparently does not play a major role in determining the antigenome start site. When antigenome transcripts from minigenomes containing short 3'-terminal extensions were sequenced, it was found that they were initiated predominantly opposite the first nucleotide of the Le sequence, rather than opposite the first nucleotide of the template (Cowton & Fearns, 2005). Thus, it appears that, although the promoter sequence must be close to the 3' end of the nucleocapsid to recruit polymerase to the template, the promoter sequence alone is sufficient to direct initiation opposite the first nucleotide of Le, indicating that interaction with this sequence positions the polymerase active site appropriately for antigenome initiation.

Encapsidation of antigenome RNA
Studies with VSV and SeV showed that the virus N protein is required in trans for virus RNA replication, probably due to its role in encapsidation (Patton et al., 1984; Horikami et al., 1992). Concurrent encapsidation increases polymerase processivity during SeV replication and it is likely that increased processivity allows the polymerase to override the GE termination signals to generate a full-length antigenome (Vidal & Kolakofsky, 1989; Gubbay et al., 2001). The mechanism by which RSV encapsidation occurs is not completely understood, but parts of the jigsaw are now beginning to come together.

If expressed in isolation, the RSV N protein binds non-specific RNA to form nucleocapsid-like structures, demonstrating that N–N interactions are sufficient to drive nucleocapsid assembly (Bhella et al., 2002; Murphy et al., 2003). In infected cells, most soluble N protein is found in a complex with P, which is thought to prevent N binding to non-specific RNA (Cuesta et al., 2000; Murphy et al., 2003; Castagné et al., 2004). Consistent with this hypothesis, there is evidence that P interaction with N interferes with N–RNA binding directly (García-Barreno et al., 1996; Khattar et al., 2000; Murphy et al., 2003). Thus, presumably encapsidation involves a conformational change in the N–P complex that allows N to be released from P, exposing its RNA-binding site.

There is evidence that a cis-acting signal directs initiation of encapsidation. Initiation products generated from minigenomes with mutations in the central part of Le were not encapsidated as efficiently as those generated from minigenomes containing wild-type sequence. In addition, although nucleotides at the 3' end of Le were found to be sufficient to direct efficient initiation of antigenome synthesis, production of full-length antigenome also required the central region of Le (Fig. 4; McGivern et al., 2005). These data suggest that the central part of Le contains sequences that facilitate antigenome encapsidation, which in turn enables the polymerase to be processive through the gene junctions to the end of the template. The sequence required for encapsidation and processivity has not been mapped precisely and could include nucleotides at the 3' end of Le, including nt 1, 2, 6 and 7, which are more important for replication than transcription.

This encapsidation signal could function in various ways. One possibility is that the sequence encodes a signal at the 5' end of the antigenome RNA, which recruits N–P complexes and causes N to be delivered to the nascent RNA chain. This could then act as a nucleation site to which further N molecules are delivered. Alternatively, the encapsidation signal could function in the context of the template; for example, it might recruit factors that facilitate encapsidation or cause the polymerase to adopt a conformation that allows it to bind N (or N–P), which results in delivery of N to the nascent RNA as it is being extruded.

Initiation of genome RNA synthesis
The TrC promoter region at the 3' end of the antigenome is 155 nt in length (Mink et al., 1991). This promoter directs a higher level of replication than the Le, resulting in an imbalance between antigenome and genome in infected cells. By analogy with Rabies virus, this bias might be important to promote incorporation of nucleocapsids of genome polarity into particles during virus assembly (Finke & Conzelmann, 1997). The TrC promoter region has not been mapped as well as the Le, but some information is available, due to the similarities between the two sequences. As shown in Fig. 4(b), the RSV TrC and Le are highly conserved for the first 26 nt, but differ thereafter. As the high level of conservation would suggest, the 3' ends of the Le and TrC regions appear to be functionally similar: a direct comparison of the first 34 nt of Le with the first 36 nt of TrC indicated that these promoter regions have similar strengths and activities, although the TrC sequence directed a slightly higher level of replication, indicating that it is optimized for this process, whereas the Le is not (Fearns et al., 2000). Single nucleotide substitutions in positions 1–7 of TrC behaved similarly to the corresponding substitutions in Le, suggesting that the remainder of the Le and TrC sequence does not alter how this 3'-terminal sequence functions (Peeples & Collins, 2000). However, there is evidence that positions 4 and 12 are optimized for transcription in the Le promoter and for replication in the TrC promoter (Fearns et al., 2002). These differences in replication efficiency could reflect differences in replicase binding, initiation or encapsidation.

The TrC promoter region directs a significantly higher level of replication than the Le primarily because of sequence lying between TrC nt 36 and 155 (Fearns et al., 2000). This observation was originally made by using a minigenome and has been confirmed in a more authentic setting by introducing deletions into the 5' Tr region of an infectious clone of RSV such that the antigenomic promoter was shortened to the 3' 36 or 57 nt. The mutant viruses were able to replicate, but generated lower levels of genome RNA than wild-type virus, confirming that the first 36 nt of the TrC promoter region are sufficient to direct replication initiation and encapsidation, but that the remainder of the TrC region enhances replication (R. Fearns & P. Collins, unpublished data). The mechanism by which this enhancing sequence functions is currently not known.

How do the data for RSV RNA synthesis integrate into the models for transcription and replication initiation?
As described above, there are two models for replication and transcription initiation, one in which both processes initiate at the 3' end of the Le (model 1) and one in which the transcriptase and replicase initiate at different sites (model 2). Although the data available for RSV do not confirm either model, they do provide information that allows these models to be fine-tuned.

Based on the data described, the first 11 nt of Le appear to contain both the replicase- and transcriptase-binding sites. There is evidence that these nucleotides also act as the replication promoter and are sufficient to direct initiation opposite nt 1. This suggests that, when the polymerase is recruited to the template, its active site is positioned over nt 1. Thus, to initiate mRNA synthesis at the GS site, the polymerase must move relative to the template in order to position its active site over the GS sequence at nt 45. Evidence that the transcriptase can do this was derived from experiments suggesting that polymerase recruited to nt 1–11 at the 3' end of the template can scan to access internal sequences (Cowton & Fearns, 2005). If this scenario is correct, then the polymerase must break its bonds with its binding site at the 3' end of the nucleocapsid and it would be expected that this would require energy.

The contacts between a polymerase and its binding site are typically broken once the polymerase has initiated RNA synthesis. Thus, model 1, in which transcription is initiated at position 1 of the Le, provides a simple explanation of how the polymerase accesses the GS signal (Fig. 5a). As discussed previously by Kolakofsky et al. (2004), release of the Le transcript near the end of Le could occur as a consequence of failure to modify the 5' terminus of the RNA. If capping was dependent on the sequence encoded by the GS signal, the RSV Le transcript would lack the sequence motif required, causing the polymerase to release the transcript after approximately 50 nt. The polymerase active site would then be unoccupied, allowing reinitiation at the GS signal. For replication to occur, the nascent RNA must become encapsidated before the polymerase reaches the 50 nt threshold. The data discussed here indicate that this could occur, as all of the sequences required for encapsidation of the RNA are located within the first 34 nt of the Le region. If encapsidation was initiated, then the processivity of the polymerase would be increased, allowing it to continue beyond 50 nt, read through the GE signals and produce an antigenome (Fig. 5c). The data regarding the role of P phosphorylation in promoter escape are consistent with this model: in transcription reactions in which a non-phosphorylated P protein was used, short transcripts of approximately 11 nt were synthesized, but there was no evidence for initiation of mRNA synthesis at the GS signal (Dupuy et al., 1999). These data suggest that the polymerase initiates and elongates transcripts from the promoter at the 3' end of Le before initiating mRNA synthesis at the first GS signal.


Fig. 5. Diagram depicting the models for transcription and replication initiation from the Le promoter region. The Le region is shown as a thick black line and the polymerase-binding sequence (nt 1–11) is hatched. The NS1 gene sequence is in dark grey and its GS signal is represented by a white box (□). The polymerase active site is represented by an oval and the 5' cap on the mRNA is shown as a filled circle (●). N protein coating the antigenome is shown in light grey. (a, b) Transcription models 1 and 2, respectively. (c) A model for replication.

 
If RSV transcription is initiated directly at the GS signal, as in model 2, then the polymerase needs to break contact with nt 1–11 without initiating RNA synthesis. It is possible that it is able to hydrolyse ATP to do this (Fig. 5b). Following release of the polymerase from its binding site, it would be able to scan along the template by using the same mechanism as when scanning at the gene junctions. When it reached the GS signal, it would be directed to initiate RNA synthesis. According to this model, because replication is initiated at position 1 of the Le, it would not require an ATPase activity. In the case of VSV, it has been shown that a higher concentration of ATP is required for transcription than for replication initiation, which is consistent with this model (Testa & Banerjee, 1979; Perrault & McLear, 1984; Beckes et al., 1987). As in model 1, because the signals for encapsidation are within the Le region, the replication product could become encapsidated, whereas the transcription product would not be. Any replication products that failed to become encapsidated would be aborted, due to poor processivity. In both of these models, the U-rich region at the end of Le might stall the polymerase to allow it to recognize the GS signal efficiently.

Regulation of transcription and replication
The processivity factors: M2-1 and N–P
As described above, M2-1 and N–P increase polymerase processivity and can cause the polymerase to read through the GE signals. Thus, these proteins were likely candidates to regulate the polymerase between its transcription and replication activities. However, this does not appear to be the case. Increasing the level of intracellular M2-1 has no detectable effect on antigenome or genome synthesis or on the levels of transcription initiation (Collins et al., 1996; Hardy & Wertz, 1998; Fearns & Collins, 1999b), demonstrating that this protein is not involved in modulating the polymerase between its dual activities. Increasing the intracellular concentration of N and P results in an increase in replication, but does not inhibit transcription (Fearns et al., 1997). At face value, this suggests that mRNA and antigenome syntheses occur independently, consistent with model 2 described above. However, if model 1 is correct, it is possible that only a fraction of initiations at the 3' end of the Le result in successful reinitiation at the GS signal or encapsidation. Thus, an increase in encapsidation, and the consequent increase in antigenome synthesis, might not have any detectable impact on transcription.
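
The last point can be illustrated with some simple arithmetic. The sketch below is purely hypothetical: the function name and every number in it (10 000 initiations at the Le, encapsidation of 1 % or 3 % of nascent transcripts, reinitiation at the GS signal after 20 % of releases) are assumptions chosen for illustration and are not measured values. It partitions initiations under model 1 into encapsidated (antigenome), reinitiated (mRNA) and aborted products, and compares the two encapsidation frequencies.

```python
def model1_outcomes(initiations, p_encapsidate, p_reinitiate):
    """Partition initiations at position 1 of the Le under model 1.

    All three arguments are hypothetical illustration values.
    p_encapsidate: fraction of nascent RNAs encapsidated before ~50 nt.
    p_reinitiate:  fraction of released polymerases that reinitiate at GS.
    """
    antigenome = initiations * p_encapsidate   # encapsidated, read through GE
    released = initiations - antigenome        # Le transcript released
    mrna = released * p_reinitiate             # reinitiation at the GS signal
    aborted = released - mrna                  # neither outcome
    return {"antigenome": antigenome, "mrna": mrna, "aborted": aborted}


# Hypothetical baseline versus raised N-P (more frequent encapsidation).
baseline = model1_outcomes(10_000, p_encapsidate=0.01, p_reinitiate=0.20)
boosted = model1_outcomes(10_000, p_encapsidate=0.03, p_reinitiate=0.20)

replication_fold = boosted["antigenome"] / baseline["antigenome"]
transcription_fold = boosted["mrna"] / baseline["mrna"]

print(f"replication fold change:   {replication_fold:.2f}")
print(f"transcription fold change: {transcription_fold:.2f}")
```

With these assumed values, antigenome synthesis rises threefold while mRNA synthesis falls by only about 2 %, a change that would be difficult to detect experimentally. The conclusion depends on encapsidation being a rare outcome; if a large fraction of initiations were encapsidated, the same model would predict a visible drop in transcription.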

The M2-2 protein
One protein that might regulate RSV polymerase activity is the virus M2-2 protein. M2-2 has a potent negative effect on RSV RNA synthesis and its expression is moderated in a natural virus infection (Collins et al., 1996; Hardy & Wertz, 1998; Bermingham & Collins, 1999; Ahmadian et al., 2000; Cheng et al., 2005). If M2-2 expression is ablated in recombinant RSV, the relative levels of RSV replicative RNAs and mRNAs are altered (Bermingham & Collins, 1999; Jin et al., 2000), suggesting that M2-2 promotes antigenome and genome accumulation, but inhibits mRNA synthesis. It should be noted that, although M2-2 might modulate the polymerase between its different activities, it cannot be the only factor that does this, as the mutant virus produces both replication and transcription products. The mutant virus mediated enhanced cell–cell fusion compared with its wild-type counterpart, which could reflect a defect in virus packaging and release. In addition, M2-2 was found to augment release of infectious virions in a reconstituted packaging assay (Teng & Collins, 1998). These data suggest that the M2-2 protein could regulate nucleocapsid RNA synthesis in preparation for virus assembly.

Concluding remarks
In this review, we have described the elegant and economical strategy that RSV uses to transcribe and replicate its genome. Transcription and replication are initiated by using overlapping, cis-acting sequences; the transcription strategy represents a simple means of generating multiple mRNAs without the necessity for numerous promoters; and replication is modulated such that synthesis of genome-sense RNA predominates. As we have attempted to convey, the paradigms that have been established with other RNA polymerases provide a useful platform for interpreting the data concerning RSV RNA synthesis and give hints as to the molecular details that are involved. However, in contrast to other organisms, RNA synthesis by RSV and other members of the order Mononegavirales has evolved to entail numerous signalling interactions between the polymerase complex, associated proteins, template and transcript, which allow the production of different RNAs.

Our understanding of the way in which the RSV genome is expressed and replicated is already leading to potential benefits in controlling the virus. Knowledge of the roles of the Le, Tr and gene-junction sequences and the identification of the polymerase and its associated proteins led to the creation of the infectious clone of RSV, which has allowed rational design of live-attenuated vaccine candidates (Collins et al., 1999; Collins & Murphy, 2002). The unique nature of the