katherukubai-blog-com.over-blog.com

Article and Academic Writings

Mitochondrial genome

1.1.2 Mitochondrial genome:

 

At the molecular level, mtDNA is the only source of critical cellular proteins outside of the eukaryotic nucleus. In the majority of eukaryotes, mtDNA is organised as a circular, double-stranded DNA molecule (Andrews et al 1999). In 1981 Anderson and colleagues (Anderson et al. 1981) published the sequence and organization of the human mitochondrial genome. This was the first mitochondrial genome ever to be sequenced and was 16,569 base pairs (bp) long. In 1999 Andrews and colleagues (Andrews et al. 1999) re-sequenced the original placental mtDNA sample used by Anderson’s group, revealing a number of errors due both to sequencing artefacts and to the use of bovine sequence in place of human mtDNA to cover regions that were technically difficult to sequence in the human sample. The revised reference sequence is known as the revised Cambridge Reference Sequence (rCRS). The two strands of the mtDNA are distinguished by their nucleotide composition: the heavy strand (H-strand) is guanine rich, whereas the light strand (L-strand) is cytosine rich. The length of the genome varies between species (15,000–17,000 bp) but is fairly consistent in humans (16,569 bp) (Andrews et al 1999). Human mtDNA is divided into two regions. The first is the coding region, which contains the genes responsible for most functions of the oxidative phosphorylation pathway; it begins at base 577 and ends at base 16,023. The second is the control region, which contains the origin of replication and two hyper-variable segments, I (HVS-I) and II (HVS-II), encompassing bases 16,024-16,365 and 73-340, respectively (Figure intro: 02) (Brandstätter et al. 2004). The control region also contains the origin of heavy-strand mtDNA replication (OH), the light-strand transcription promoter (LSP) and the heavy-strand promoters (HSP1 and HSP2) (Figure intro: 03) (Falkenberg et al. 2007). These elements appear to be involved in genome replication and transcription; nevertheless, their function is not completely understood.
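The region boundaries quoted above can be turned into a small lookup, which is handy when annotating variant positions. The following is only a minimal sketch: the function name and the region labels are illustrative choices, and HVS-I is assumed to start at base 16,024, immediately after the end of the coding region. Because the molecule is circular, the control region wraps around the numbering origin (bases 16,024-16,569 and 1-576).

```python
# Region boundaries (1-based, inclusive) on the 16,569 bp human reference sequence.
GENOME_LENGTH = 16569
CODING = (577, 16023)
HVS1 = (16024, 16365)  # assumption: HVS-I starts right after the coding region
HVS2 = (73, 340)


def classify_position(pos: int) -> str:
    """Return the region label for a nucleotide position (illustrative labels)."""
    if not 1 <= pos <= GENOME_LENGTH:
        raise ValueError(f"position {pos} is outside 1..{GENOME_LENGTH}")
    if CODING[0] <= pos <= CODING[1]:
        return "coding region"
    if HVS1[0] <= pos <= HVS1[1]:
        return "control region (HVS-I)"
    if HVS2[0] <= pos <= HVS2[1]:
        return "control region (HVS-II)"
    # Everything else lies in the control region but outside the two segments.
    return "control region (other)"


if __name__ == "__main__":
    for p in (263, 3243, 16189, 16569):
        print(p, classify_position(p))
```

Any position outside the coding region and the two hyper-variable segments is simply reported as belonging to the rest of the control region.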

Figure intro-02 Control region in the mitochondrial DNA (Jobling et al., 2013).

 

 

Unlike its nuclear DNA (nDNA) counterpart, mtDNA is extremely compact, with ~93% of the molecule representing coding sequence. Unlike nDNA genes, mtDNA genes lack introns. Most genes are contiguous or separated by only one or two non-coding base pairs. mtDNA contains only one significant non-coding region, the displacement loop (D-loop) (Andrews et al 1999). The coding region contains 37 genes, 28 on the H-strand and 9 on the L-strand (Figure intro: 03). Thirteen of these genes each encode one polypeptide component of the mitochondrial respiratory chain (RC), the site of cellular energy production through oxidative phosphorylation (OXPHOS): ND1-ND6 and ND4L encode seven subunits of complex I (NADH-ubiquinone oxidoreductase); cyt b encodes a subunit of complex III (ubiquinol-cytochrome c oxidoreductase); COI-COIII encode three subunits of complex IV (cytochrome c oxidase, or COX); and ATP6 and ATP8 encode two subunits of complex V (ATP synthase). The remaining genes encode two ribosomal RNAs (12S and 16S rRNA) and 22 transfer RNAs (tRNAs), which are required and sufficient for the synthesis of mitochondrial proteins (Anderson et al. 1981; DiMauro and Schon 2003; Wallace 1994).
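As a quick consistency check on the gene inventory just described, the counts can be tallied programmatically. This is a minimal sketch: the dictionary layout and function name are illustrative, and the tRNA labels are the single-letter codes listed in the caption of Figure intro-03.

```python
# Protein-coding genes grouped by the respiratory-chain complex they contribute to.
PROTEIN_GENES = {
    "complex I": ["ND1", "ND2", "ND3", "ND4", "ND4L", "ND5", "ND6"],
    "complex III": ["cyt b"],
    "complex IV": ["COI", "COII", "COIII"],
    "complex V": ["ATP6", "ATP8"],
}
RRNA_GENES = ["12S rRNA", "16S rRNA"]
# tRNA labels as given in the Figure intro-03 caption.
TRNA_LABELS = "F V L1 I M W D K G R H S1 L2 T P E S2 Y C N A Q".split()


def gene_counts() -> dict:
    """Tally the mtDNA genes by product type."""
    n_protein = sum(len(genes) for genes in PROTEIN_GENES.values())
    counts = {
        "protein": n_protein,
        "rRNA": len(RRNA_GENES),
        "tRNA": len(TRNA_LABELS),
    }
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    print(gene_counts())
```

The tally reproduces the figures in the text: 13 protein-coding genes, 2 rRNA genes and 22 tRNA genes, 37 in total.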

 

Figure intro-03 Schematic representation of mammalian mtDNA. The double-stranded circular mammalian mtDNA molecule of ~16.5 kb contains a single longer noncoding region, the displacement loop (D loop) region, harboring the promoters for transcription of both mtDNA strands (HSP and LSP) and the origin of leading strand replication (OH). The origin of lagging strand replication (OL) is embedded in a cluster of tRNA genes. The genes for the two rRNAs (12S and 16S rRNA), 13 mRNAs (ND1–6, ND4L, Cyt b, COI–III, ATP6, and ATP8), and 22 tRNAs (F, V, L1, I, M, W, D, K, G, R, H, S1, L2, T, P, E, S2, Y, C, N, A, and Q) are indicated by boxes.

 

Mitochondria are dependent upon the nuclear genome for the majority of the OXPHOS system and also for maintaining and replicating mtDNA as well as organelle network proliferation and destruction. To date, 92 structural OXPHOS subunit genes have been identified (Chinnery and Hudson 2013): 13 encoded by mtDNA and 79 encoded by the nuclear genome (Hirst 2011, Angerer et al 2011, Smith et al 2015, Vogel et al 2007, Mick et al 2011, Wang et al 2001).

 

1.1.2.1 Nuclear Mitochondrial DNA segment (NUMTs):

 

The term NUMT was coined by the molecular evolutionary biologist Jose V. Lopez to describe the transposition of any type of cytoplasmic mitochondrial DNA into the nuclear genome of eukaryotic organisms (Lopez et al 1994). As whole-genome sequences of more organisms accumulate, more and more NUMT sequences of varying length have been detected across a diverse range of eukaryotes (Bravi et al 2006). NUMTs nevertheless differ in number and size across species (Mishmar et al 2004, Sacerdot et al 2008, Schizas 2012). NUMTs have been used as genetic markers to estimate the relative rates of nuclear and mitochondrial mutation and to reconstruct evolutionary trees (Bensasson et al 2001).

For the human mitochondrial tree, NUMTs could be useful as outgroups, since they may have been present in the slowly evolving nuclear genome for millions of years. The most important issue with NUMTs, however, is the analytical problems they may cause owing to their instability. NUMTs can compete with authentic mtDNA during analysis and yield erroneous mtDNA sequences, which could lead to wrong interpretations in population genetics, phylogeography and forensic genetics. In the human genome, the most recent count of NUMTs is 755 fragments, ranging in size from 39 bp to almost the entire mitochondrial sequence (Dayama et al 2014). Thirty-three paralogous sequences are longer than 500 bp and share over 80% sequence similarity (Ramos et al 2011). Moreover, not all NUMT fragments in the genome are the result of mtDNA migration; some are the outcome of amplification after insertion. Old NUMTs are more abundant in the human genome than recent integrants, indicating that mtDNA can be amplified once inserted (Dayama et al 2014).

 

1.1.2.2 Replication:

 

Mitochondrial DNA is replicated continuously by DNA polymerase γ, independently of the cell cycle, even in non-dividing tissues such as skeletal muscle and brain (Birky 2001; Bogenhagen and Clayton 1977). The fortuitously slow rate of mtDNA replication has facilitated the isolation and characterization of in vivo replicative intermediates, but the precise mechanism of mtDNA replication remains a topic of considerable debate. According to the traditional strand-asymmetric model, proposed by Clayton in 1982, mammalian mtDNA molecules replicate unidirectionally from two spatially and temporally distinct, strand-specific origins (figure intro: 04) (Brown et al 2005). Replication starts on the H-strand at the OH site (the origin of H-strand replication), where cleavage of a primary transcript synthesised from the L-strand promoter generates the primer. H-strand replication frequently stops about 700 bp downstream of the OH site at the termination-associated sequences (TAS), creating a triple-stranded structure called the displacement loop, or simply D-loop. Replication of the H-strand continues clockwise until it exposes the origin of light-strand replication (OL) (figure intro: 04). From this point, replication of the L-strand starts in the counter-clockwise direction. The newly synthesised strands are then ligated to form closed circles (Clayton 1982). This mode of replication probably generates a strand-specific mutational bias: mutational damage is likely to be higher in the H-strand, since it spends more time single-stranded (Bielawski et al 2002).

Other experimental evidence supports an alternative, strand-symmetric or “rolling cycle” model (figure intro: 04). It proposes that replication starts at several points within a 5.5 kb region between the control region and the ND4 gene (referred to as ORI) (Bowmaker et al. 2003). Replication then proceeds in both directions, stopping at OH and pausing briefly at OL before completing the cycle. The replication cycle finishes when the lagging strand catches up through the joining of Okazaki fragments (Yasukawa et al 2005).

The mechanism that regulates replication is largely unknown. It is possible that mtDNA molecules aggregate through the D-loop in association with specific proteins, thereby regulating replication, possibly by inhibiting it or by restricting it to particular molecules (Hol et al 2007).

 

 

Figure intro-04: Models of mammalian mtDNA replication

 

 

 

 

 

1.1.2.3 Transcription:

 

Transcription of mtDNA is ‘prokaryotic-like’. Human mtDNA has three promoters for the initiation of transcription: HSP1, HSP2 and LSP (heavy-strand promoter 1, heavy-strand promoter 2 and the light-strand promoter). Transcription of the heavy strand is initiated either from HSP1, generating a short transcript that terminates at the 16S rRNA, or from HSP2, generating a polycistronic message that includes both rRNA genes, 12 mRNA genes and 14 tRNA genes. Light-strand transcription from LSP generates the ND6 mRNA and 8 tRNAs (Taylor and Turnbull 2005). The full-length transcripts are then cleaved into functional tRNA, rRNA and mRNA molecules. The core mitochondrial transcription machinery consists of the mitochondrial RNA polymerase (POLRMT), mitochondrial transcription factor A (TFAM), and one of two homologous mitochondrial transcription factors, B1 (TFB1M) or B2 (TFB2M) (Chinnery and Hudson 2013; Falkenberg et al. 2002). Despite considerable knowledge of the components involved, the regulation of mitochondrial transcription and its coupling to protein translation and ribosome biogenesis are poorly defined, although new regulatory proteins continue to be identified (Wang et al. 2007b). The mitochondrial transcription termination factors are a group of DNA-binding proteins, including MTERF (also called MTERF1), that are thought to be involved in the regulation of mitochondrial transcription, although their functions and mechanisms of action remain to be defined. One member, MTERF3, is ubiquitously expressed and has been shown to act in mitochondria as a transcriptional repressor by binding to the mtDNA promoter region (Park et al. 2007).

 

1.1.2.4 Translation

 

The mitochondrial translation machinery works in close cooperation with the cytoplasmic translation machinery, which makes the nuclear-encoded proteins destined for the mitochondrion (Chinnery and Hudson 2013). Translation of the 13 mtDNA protein-coding genes occurs within the mitochondria. The mitoribosomes are partly encoded by mtDNA (MTRNR1 and MTRNR2) but require a further 81 nDNA-encoded proteins. Nascent mtRNA is translated co-transcriptionally by mitochondrial ribosomes bound both to the polymerase (Kornberg A 1992) and to the inner mitochondrial membrane (Liu and Spremulli 2000). The close proximity of the two sets of machinery on either side of the membranes ensures efficient assembly of mitochondrial complexes containing proteins encoded by both the nuclear and the mitochondrial genomes (Iborra et al. 2004).

Translation is initiated by two mitochondrial initiation factors, mtIF2 and mtIF3 (Ma et al 1995, Koc and Spremulli 2002). mtIF3 begins initiation by dissociating the ‘mitoribosome’ (the mitochondrial ribosome) into its subunits, allowing assembly of the initiation complex (Christian et al 2009). The mRNA is then bound to the small subunit, aligning the start codon with the peptidyl site of the mitoribosome. Peptide elongation is controlled by a number of nuclear-encoded factors, including mitochondrial elongation factor Tu (mtEFTu) (Hammarsund et al 2001, Ling et al 1997), which binds the tRNA to the mitoribosome, and mitochondrial elongation factor G1 (mtEFG1), which is required to move the newly added amino acid along one position, allowing the next amino acid to be added (Smits et al 2010). Translation termination is carried out solely by mitochondrial release factor 1a (mtRF1a) (Zhang et al 1998), which recognizes the stop codons (UAA and UAG) (Soleimanpour-Lichaei et al 2007) and triggers hydrolysis of the bond between the terminal tRNA and the nascent peptide (Chinnery and Hudson 2013).

 

1.1.2.5 Mitochondrial Genetic code:

 

The genetic code is the set of rules by which information encoded in genetic material (DNA or RNA sequences) is translated into proteins (amino acid sequences) by living cells. After the code was established in 1966 it was considered ‘universal’, but in 1979, less than 15 years after it was deciphered, the code used in vertebrate mitochondria was found to differ from the universal code at some codons (Barrell et al. 1979). In mtDNA, ‘AGA’ and ‘AGG’ (arginine in nDNA) are assigned as stop codons alongside ‘UAA’ and ‘UAG’ (Temperley et al 2010), whereas ‘UGA’, a stop codon in nDNA, encodes tryptophan (figure intro: 05). Several transcripts lack a complete stop codon, so UAA codons have to be completed at the post-transcriptional level. In addition, ‘AUA’, isoleucine in nDNA, encodes methionine in mtDNA (Chinnery and Hudson 2013).

Figure intro: 05: Vertebrate mitochondrial genetic code. The corresponding assignment of each codon in the nuclear genome is indicated in red in parentheses.
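The codon reassignments described above can be illustrated by translating the same RNA string under both codes. This is a minimal sketch rather than a full translation tool: the names `STANDARD_CODE`, `MITO_CODE` and `translate` are illustrative, the standard table is built from the conventional 64-letter amino acid string in UCAG order, and only the four vertebrate mitochondrial differences mentioned in the text (UGA, AUA, AGA, AGG) are overridden. Stop codons are shown as `*` and translation does not halt at them, so the differences stay visible.

```python
BASES = "UCAG"
# Amino acids for all 64 codons in UCAG order (first, second, third base).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    b1 + b2 + b3: STANDARD_AA[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Vertebrate mitochondrial code: copy the standard table, then apply the
# reassignments discussed in the text.
MITO_CODE = dict(STANDARD_CODE)
MITO_CODE.update({"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"})


def translate(rna: str, code: dict) -> str:
    """Translate an RNA (or DNA) string codon by codon; '*' marks a stop codon."""
    rna = rna.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return "".join(code[rna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    seq = "AUAUGAAGA"  # toy sequence made of three reassigned codons
    print("nuclear code:      ", translate(seq, STANDARD_CODE))
    print("mitochondrial code:", translate(seq, MITO_CODE))
```

The toy sequence consists only of reassigned codons, so every position of the output differs between the two codes.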

 

 

1.2.6 The mitochondrial Eve.

 

The hypothetical woman at the root of all human haplogroups (meaning just the mitochondrial DNA haplogroups) is the matrilineal most recent common ancestor (MRCA) of all currently living humans. She is commonly called Mitochondrial Eve, the name under which the root of the human mitochondrial tree (Cann et al 1987) was popularized in the press. This sequence represents a theoretical ancestor of all the remaining mitochondrial lineages in the human species. In popular texts it is commonly referred to as the sequence of the ancestral woman from whom the whole human species is descended, but different parts of the genome have different histories, and it is even possible that the women who carried this sequence did not contribute any of their sequences to the nuclear genome of the current human species. The age of the mitochondrial Eve, or the coalescence time of the human mitochondrial tree, is nearly 200,000 years using the Mishmar coding-region rate (Mishmar et al 2003); 186,000 years ago (ya) using the synonymous rate of one synonymous transition per 7884 years of Soares et al. (2009), the rate of one synonymous mutation per 6764 years reported by Kivisild et al. (2006) being too fast; and about 190,000 ya using the whole mtDNA genome corrected for purifying selection (Soares et al., 2009).

The mitochondrial tree was traditionally rooted using the chimpanzee, bonobo and gorilla sequences (Arnason et al 1996, Hixson and Brown 1986, Xu and Arnason 1996), but more recently the complete mitochondrial genome of the Neanderthal became available (Green et al 2008). The first split occurs between haplogroup L0 and all of the remaining haplogroups, L1-L6 (figure Intro: 06). The actual root has still not been completely defined, but the mitochondrial Eve status shown in several papers (Maca-Meyer et al 2001, Torroni et al 2006, Behar et al 2008) can now be revised taking the Neanderthal sequence into account.

 

 

Figure intro: 06. Mitochondrial DNA phylogenetic trees.

 

 

1.1.3.4 Studies using Human mtDNA:

 

mtDNA studies have had a deep impact on the scientific community. Mitochondrial pathologies are commonly reported in the literature. Because the main function of mitochondria is to produce energy for the cell, mutations give rise to a range of variable phenotypes (Taylor et al 2005). The fact that mutations can be heteroplasmic, with a variable percentage of the deleterious haplotype, also generates an array of phenotypes for the same mutation (Chinnery 2006). Additionally, somatic mtDNA mutations are possibly involved in the propagation of cancer tissues, and somatic mutations accumulate with ageing (Taylor et al 2005). This makes mtDNA clinical genetics as topical as ever. Finally, mtDNA is still the primary marker used in evolutionary studies, and many studies on human evolution (or archaeogenetics) are published each year. That theme will be the focus of the following section of this Introduction.

Archaeogenetics is the application of molecular genetics to the study of the human past. It is especially concerned with the reconstruction of the dispersal history of humankind. The study mostly relies on DNA data from living populations, as well as a substantial contribution from ancient DNA studies. Phylogeography is the study of the geographical distribution of the lineages in a phylogeny (Avise et al., 1987; Jobling et al., 2003). The underlying principle is that every mutation takes place at a particular point in space and time, and each event can be in theory reconstructed from the distribution. In other words, phylogenetic analysis is applied to geographically labelled samples where it is possible to estimate the number and timings of different colonisation events using the geographic origin of the samples and the time depth of lineages on the genealogical tree. However, this approach is not always straightforward due to the low density of mutations and recombination in autosomes and the X chromosome, and high drift in the Y chromosome. Since mutations occur frequently in mtDNA, the high density of mutations tracks the distribution of lineages through space and over time in higher detail and precision especially when complete mtDNAs are used. By comparing the mtDNA lineages from one region to another, it is possible to infer the direction and timing of dispersals.

 

 

 

 

 

1.1.2.4 Other complementary lines of evidence

 

Phylogeographic analysis works best within a model-based framework that uses complementary evidence from other fields, including archaeology, linguistics, climatology, geology, palaeontology and radiocarbon dating. Genetic data alone cannot serve as a predictor of the cultural and linguistic affiliation of its carrier. In other words, phylogenetic analysis is able to show the magnitude of immigration at a particular point in time and place, but there is nothing in the genetic evidence per se that will associate that immigration with a particular culture or language (Richards et al., 2002).

Radiocarbon (14C) dating started in the 1950s and continues to be the most widely employed method of inferring chronometric age for late Pleistocene and Holocene materials (Taylor, 1995). The two main methods employed in radiocarbon dating are decay counting (using liquid scintillation counting of acid-washed or acid-base-acid (ABA) pretreated samples, and gas proportional counters) and accelerator mass spectrometry (AMS). AMS requires only a small sample, which may make it possible to use a pre-treatment method (such as acid-base-oxidation (ABOX) for charcoal, or ultra-filtration for bone) that could not be applied if a large sample had to be retained (Ramsey, 2008). Alternative dating techniques include thermoluminescence and optical dating. Optical dating is used to estimate the time since quartz sediments were last exposed to sunlight (Huntley et al., 1985; Aitken, 1998).

 

 

 

 

 

 

1.2.3 The mtDNA molecular clock

 

Genetic dating is one of the most crucial aspects of the phylogenetic analysis as it enables us to give a temporal framework to significant events in the evolutionary history of species and populations. Evolutionary events are visualised in the phylogenetic tree as splits from the common ancestor, producing two or more descendent branches. One can estimate the time of the divergence by averaging the number of mutations accumulated along the descendent branches over time and by subsequently applying a (well-calibrated) molecular clock.

Accurate estimation of mutation rates and proper calibration of the molecular clock are therefore essential for genetic dating. However, in mtDNA these estimations are somewhat problematic. Firstly, the high mutation rate means that some positions may be hit many times by mutations, and some of these events pass undetected in the phylogeny, leading to underestimation of the mutation rate. Secondly, there is high heterogeneity in the rate of variation, meaning that some sites mutate very fast whereas others mutate very slowly, if at all (Hasegawa et al., 1993, Stoneking, 2000). This problem was first underlined by Vigilant et al. (1991), who found that the average mutation rate in the control region is five times higher than in the coding region (Bandelt et al., 2006). Since then several attempts have been made to calculate mutation rates in both the coding region (Mishmar et al., 2003, Kivisild et al., 2006, Ingman et al., 2000) and the control region (Forster et al., 1996, Torroni et al., 1994). Forster et al. (1996) estimated one transition per 20,180 years for the HVS-I control region (1.80 × 10⁻⁷ transitions per nucleotide per year), and Mishmar et al. (2003) suggested one transition per 5138 years (1.26 × 10⁻⁸ base substitutions per nucleotide per year) for the coding region (np 577-16023), a per-nucleotide rate more than 10 times lower. These mutation rates assumed clock-like evolution of human mtDNA, with a homogeneous distribution of the mutation rate across time. However, the mtDNA phylogeny shows higher proportions of non-synonymous coding mutations at the tips of the branches than deeper in the tree, indicating that purifying selection acts progressively on mtDNA (Kivisild et al., 2006; Pereira et al., 2011). Kivisild et al. (2006) proposed, using a phylogeny of 277 individuals, a mutation rate of one transition per 6764 years for synonymous substitutions only. Soares et al. (2009) provided a revised estimate of the synonymous substitution rate and found it to be considerably slower, at one synonymous mutation per 7884 years. Synonymous substitutions are not under selection pressure and show less saturation than the control region.
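The two ways of expressing a clock rate used above (years per mutation over a whole region, and substitutions per nucleotide per year) are related by the number of sites in the region, and a divergence time follows from multiplying the average number of mutations on the descendant branches by the years-per-mutation figure. The sketch below illustrates this arithmetic only; the function names are illustrative, the coding-region length is derived from the np 577-16023 range given in the text, and the value of 25 mutations is a purely hypothetical input, not a published estimate.

```python
def per_site_rate(years_per_mutation: float, n_sites: int) -> float:
    """Convert 'one mutation per N years over a region' to a per-nucleotide, per-year rate."""
    return 1.0 / (years_per_mutation * n_sites)


def age_from_mean_mutations(mean_mutations: float, years_per_mutation: float) -> float:
    """Estimate a divergence time from the average number of mutations to the tips."""
    return mean_mutations * years_per_mutation


# Coding region np 577-16023, inclusive.
CODING_SITES = 16023 - 577 + 1

if __name__ == "__main__":
    # Mishmar et al. (2003): one substitution per 5138 years in the coding region.
    rate = per_site_rate(5138, CODING_SITES)
    print(f"coding-region rate: {rate:.3g} per nucleotide per year")

    # Hypothetical clade with an average of 25 synonymous mutations to the tips,
    # dated with the rate of one synonymous mutation per 7884 years.
    print("age estimate:", age_from_mean_mutations(25, 7884), "years")
```

Applied to the coding-region figure of one mutation per 5138 years, the conversion recovers the per-nucleotide rate of about 1.26 × 10⁻⁸ quoted in the text.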

 
