SARS-CoV-2 can integrate into the genome thanks to LINE1

by | Jun 15, 2022 | Molecular Biology

Figure 1: The flanking sequences include a 20-bp direct repeat; this target site duplication is a signature of LINE1-mediated retro-integration. The full-length SARS-CoV-2 NC subgenomic RNA sequence is shown in red and the human genomic sequences in black. Features indicative of LINE1-mediated “target-primed reverse transcription” include the target site duplication (light green) and the LINE1 endonuclease recognition sequence (TTCT).

Abstract

In the article “Reverse-transcribed SARS-CoV-2 RNA can integrate into the genome of cultured human cells and can be expressed in patient-derived tissues” [1], Liguo Zhang et al. investigated the possibility that subgenomic RNAs of SARS-CoV-2 could be reverse-transcribed and integrated into the DNA of cultured human cells. This mechanism appears to be mediated by LINE1 retrotransposable elements.

In some subjects, positivity for SARS-CoV-2 on PCR tests was extremely prolonged, and in others a new positive result was reported after recovery from the first COVID-19 infection, even though viral particles were not isolated from these subjects.

The authors suggest that the transcription of the integrated sequences could explain some of the positive PCR assays observed in patients.

Discussion

Recurrent positive PCR tests for SARS-CoV-2 after recovery have been reported in cohort-based studies with subjects held in strict quarantine after they recovered from COVID-19, suggesting that at least some “re-positive” cases were not caused by reinfection. SARS-CoV-2 is a positive-stranded RNA virus (group IV in the Baltimore classification). Like other beta-coronaviruses, it employs an RNA-dependent RNA polymerase to replicate its genomic RNA and transcribe subgenomic RNAs. It is a lytic virus whose replication requires the synthesis of a negative strand, which the RNA polymerase uses as a template [2]. The study by Liguo Zhang et al. asks whether viral RNA can be reverse-transcribed, integrated into the human genome and subsequently transcribed, causing a positive PCR result without producing viral particles. According to the article, this mechanism is mediated by LINE1.

Human LINE1 elements (∼17% of the human genome) are autonomous retrotransposons, able to retrotranspose themselves and other nonautonomous elements such as Alu; they are also a source of endogenous cellular reverse transcriptase (RT). Endogenous LINE1 elements have been shown to be expressed in aged human tissues and are commonly up-regulated upon viral infection, including SARS-CoV-2 infection [3]. In general, however, these elements are kept inactive except in specific cases such as stem cells.

TPRT (target-primed reverse transcription) is the mechanism by which LINEs insert themselves into DNA. It consists of several steps: transcription of the LINE, translation of its proteins, binding of these proteins to the LINE messenger RNA, and binding of the resulting complex to the target DNA. Next come the cutting of the target site and formation of the DNA-RNA hybrid, synthesis of the first cDNA strand, degradation of the RNA and synthesis of the second strand, and finally DNA ligation and repair, which yield the new LINE copy. Only 1 in 100 attempts goes to completion [4].

Zhang et al. transfected HEK293T cells with LINE1 expression plasmids before infection with SARS-CoV-2 in order to increase integration events. They isolated DNA from the cells 2 days after infection and detected DNA copies of SARS-CoV-2 nucleocapsid (NC) sequences in the infected cells by PCR. The viral NC DNA sequence was confirmed by Sanger sequencing.

These results suggest that SARS-CoV-2 RNA can be reverse-transcribed, at least when LINE1 elements are overexpressed.

To demonstrate directly that the SARS-CoV-2 sequences were integrated into the host cell genome, DNA isolated from infected LINE1-overexpressing HEK293T cells was used for Nanopore long-read sequencing. Figure 1B of the article shows an example of a full-length viral NC subgenomic RNA sequence (1,662 bp) integrated into chromosome X and flanked on both sides by host DNA sequences. These results indicate that SARS-CoV-2 sequences can be integrated into the genomes of cultured human cells by a LINE1-mediated retroposition mechanism.

DNA copies of portions of the viral genome were found in almost all human chromosomes.

Importantly, about 67% of the flanking human sequences included either a consensus or a variant LINE1 endonuclease recognition sequence.

These LINE1 recognition sequences were either at the chimeric junctions that were directly linked to the 3′ end (poly-A tail) of viral sequences, or within a distance of 8–27 bp from the junctions that were linked to the 5′ end of viral sequences, which is within the potential target site duplication.
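The two signatures described above (a direct repeat flanking the insert and a recognition sequence near the junction) can be checked with a few lines of Python. This is only a minimal sketch, not the authors' pipeline: the function names `find_tsd` and `motif_distances`, the length limits, and the toy sequences are all hypothetical, and only the TTCT motif and the 8–27 bp window come from the text.

```python
def find_tsd(upstream: str, downstream: str, min_len: int = 5, max_len: int = 30) -> str:
    """Return the longest direct repeat that ends the upstream host flank
    and also starts the downstream host flank (a candidate target site
    duplication), or an empty string if none is found."""
    upstream, downstream = upstream.upper(), downstream.upper()
    longest = min(max_len, len(upstream), len(downstream))
    for k in range(longest, min_len - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream[-k:]
    return ""


def motif_distances(upstream: str, motif: str = "TTCT", window: int = 27) -> list:
    """List the distances (in bp, counted from the start of the motif to the
    junction) of every motif occurrence starting within `window` bp of the
    junction at the end of the upstream flank."""
    upstream = upstream.upper()
    region_start = max(0, len(upstream) - window)
    distances = []
    pos = upstream.find(motif, region_start)
    while pos != -1:
        distances.append(len(upstream) - pos)
        pos = upstream.find(motif, pos + 1)
    return distances


# Toy, made-up sequences: a 20-bp repeat flanks the (omitted) viral insert.
tsd = "AAGTTCTGACCTGATTACGC"
upstream_flank = "GGCATCCA" + tsd    # host DNA before the 5' end of the insert
downstream_flank = tsd + "TTGACCGA"  # host DNA after the poly-A tail

print(find_tsd(upstream_flank, downstream_flank))
print(motif_distances(upstream_flank))
```

In this toy case the repeat is recovered in full and the TTCT motif lies inside it, which is the arrangement the article describes for junctions linked to the 5′ end of the viral sequence.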

About 71% of the viral sequences were flanked by intronic or intergenic cellular sequences and 29% by exons. Because exons make up only about 1.1% of the human genome, this suggests preferential integration into exon-associated target sites.
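To make the size of this preference concrete, the observed exon fraction can be divided by the fraction expected under random integration. The sketch below uses only the two percentages quoted above; the helper name `fold_enrichment` is hypothetical, and the calculation assumes that random integration would hit exons in proportion to their share of the genome.

```python
def fold_enrichment(observed_fraction: float, expected_fraction: float) -> float:
    """Ratio between the observed fraction and the fraction expected by chance."""
    return observed_fraction / expected_fraction


# 29% of junctions were in exons; exons are about 1.1% of the genome.
enrichment = fold_enrichment(0.29, 0.011)
print(f"Exon-flanked integrations are about {enrichment:.0f}-fold over-represented")
```

Under that assumption, exon-flanked integrations are roughly 26 times more frequent than chance alone would predict.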

Illumina paired-end whole-genome sequencing gave similar results.

In particular, viral–cellular boundaries were frequently close to the 5′ or 3′ UTRs of cellular genes, suggesting a preference for integration close to promoters or poly(A) sites. It is therefore possible that these integrated sequences can be transcribed.

To assess whether genomic integration of SARS-CoV-2 sequences could also occur in infected cells that did not overexpress RT, the authors isolated DNA from virus-infected HEK293T and Calu3 cells that were not transfected with an RT expression plasmid. They detected a total of seven SARS-CoV-2 sequences fused to cellular sequences, all of which showed LINE1 recognition sequences close to the human–SARS-CoV-2 sequence junctions.

To investigate the possibility that SARS-CoV-2 sequences integrated into the genome can be expressed, the authors analysed published RNA-seq data from SARS-CoV-2–infected cells for evidence of chimeric transcripts. Examination of these datasets revealed a number of human–viral chimeric reads that occurred in multiple sample types, including cultured cells and organoids from lung, heart, brain and stomach tissues. The abundance of the chimeric reads correlated positively with the viral RNA level across the sample types. Chimeric reads generally accounted for 0.004–0.14% of the total SARS-CoV-2 reads in the samples. Most of the chimeric junctions mapped to the sequence of the SARS-CoV-2 NC gene. This is consistent with the finding that NC RNA is the most abundant SARS-CoV-2 subgenomic RNA, making it the most likely target for reverse transcription and integration.

They reasoned that the orientation of an integrated DNA copy of SARS-CoV-2 RNA should be random with respect to the orientation of the targeted host gene, predicting that about half the viral DNAs that were integrated into an expressed host gene should be in an orientation opposite to the direction of the host cell gene’s transcription. As predicted, about 50% of viral integrants in human genes were in the opposite orientation relative to the host gene.
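One way to check whether an observed split is compatible with random orientation is an exact two-sided binomial test against a probability of 0.5. The sketch below is only an illustration of that reasoning, not a test reported in the article: the function name `binomial_two_sided_p` is hypothetical and the integrant counts are invented, since the text does not give the underlying numbers.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than observing k successes out of n trials."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small tolerance so that ties are not lost to floating-point rounding.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


# Hypothetical counts: integrants in the opposite orientation out of 20.
print(binomial_two_sided_p(10, 20))  # an even split
print(binomial_two_sided_p(18, 20))  # a strongly skewed split
```

A large p-value, as for the even split, means the data are compatible with random orientation; a very small one, as for the skewed split, would argue against it.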

It is important to note that almost all the viral transcripts in acutely infected cells have a positive orientation (figure 3C-3D of the article); this is expected, since the viral RNA transcripts have this orientation. However, retrotransposition inserts a foreign gene segment without any preferred direction, so the orientation of the new insert can be positive or negative at random, and the chimeric transcripts consequently have a mixed orientation. In fact, negative-orientation chimeric transcripts account for about 50% in some samples, which indicates that integration has taken place in those cells (Figure 2).

Figure 2: Examples of the viral genome integrated in both orientations (red) into the human genome (light blue).

These data fit with a mechanism of reverse transcription and retro-integration and suggest that endogenous LINE1 RT may be involved in the reverse transcription and integration of SARS-CoV-2 sequences into the genomes of infected cells.

Conclusions

In conclusion, this article is thought-provoking because it opens up possible future studies on the integration of exogenous genomes into human DNA. Above all, it makes one think about the dynamic nature of the human genome, which is often considered not to be subject to changes of this kind; these could occur more frequently than hitherto assumed (as the article reports, chimeric reads generally accounted for 0.004–0.14% of the total SARS-CoV-2 reads in the samples, meaning that the transcriptional rate is also quite impressive). In a controlled environment, the study demonstrates beyond doubt that these integrations can occur despite the fact that the TPRT process is inefficient and LINE1s are often inactive.

In our opinion, the paper has some shortcomings. First, in the experiments with cells transfected with the LINE1 plasmid, the authors looked only for viral genome integration events and did not check for the integration of endogenous RNAs. It would be interesting to compare the rate of viral insertions with the total number of integrations that occurred.

The authors state that seven integrations occurred in the absence of the plasmid but do not specify in how many cells, so the actual frequency of integration is not known.

Finally, the RNA-seq data they analysed were taken from previous publications and, in our view, are too few for a robust analysis.

According to the authors, it will be important in future to demonstrate the presence of SARS-CoV-2 sequences integrated into the host genome in patient tissues. As they say, however, this will be extremely difficult, because only a small fraction of cells in patient tissues are expected to be positive for viral sequences. In addition, only a fraction of patients may carry SARS-CoV-2 sequences integrated into the DNA of some cells.

The relevance of these integrations, in particular with respect to PCR test positivity, has to be weighed against the fact that they occur in somatic cells, which can be eliminated by the immune system and undergo turnover, resulting in the loss of the integration.

Therefore, despite the scientific evidence retrieved from in vitro studies, LINE1 activation is not, in our opinion, a clinical problem in vivo.

References

  1. Zhang L. et al. “Reverse-transcribed SARS-CoV-2 RNA can integrate into the genome of cultured human cells and can be expressed in patient-derived tissues”. PNAS 118 (21), e2105968118 (2021).
  2. Thiel V. “Coronaviruses: Molecular and Cellular Biology”. Caister Academic Press (2008).
  3. Jones R. B. et al. “LINE-1 retrotransposable element DNA accumulates in HIV-1-infected cells”. J. Virol. 87, 13307–13320 (2013).
  4. Strachan T., Goodship J., Chinnery P. “Genetica & genomica nelle scienze mediche”. (2016).

Gregorio Scancarello

Master's student in Industrial Biotechnology

Alessio Verderio

Master's student in Industrial Biotechnology