Phylogeny Fish: Gobiaria

May 24, 2017 | Author: Cornellius Wijaya | Categories: Evolutionary Biology, Phylogenetics, Molecular Phylogenetics and Evolution, Fish Phylogeny, gobiaria
Cornellius Yudha Wijaya
Introduction
Percomorph fishes, or spiny-rayed fishes, are a group often referred to as the "bush at the top" because of their unresolved phylogenetic history (David and Wiley 2007, Sanciangco et al. 2015). This unresolved history is shown in Figure 1.
Figure 1. Phylogeny of Percomorph Fishes (Sanciangco et al. 2015)
As Figure 1 shows, there is an unresolved polytomy in the percomorph phylogeny. The only resolved cases here are Ophidiaria and Batrachoidaria, each recovered as a monophyletic group with 100% bootstrap support. In this research, we are interested in the polytomous clade, especially the Gobiaria group and its relationship with its sister groups.
Gobiaria is a group of fishes consisting mostly of gobies, but it also accommodates nurseryfishes, cardinalfishes and sleepers. The group includes both marine and freshwater bottom-living fishes (Berkovitz and Shellis 2017).
Aims
The aim of this research is to investigate whether Gobiaria shows reciprocal monophyly with its percomorph sister groups.
If Gobiaria shows reciprocal monophyly, this research also aims to address the unresolved polytomy by establishing which sister group is most closely related to Gobiaria.

Materials and methods
The work in this research was divided into two parts: laboratory work (wet lab) and computational work.
Laboratory work (wet lab)

DNA Extraction
CTAB buffer with 0.1 mg proteinase K per ml was prepared and preheated to 60 °C. A small piece was cut from the fin tip of each alcohol-preserved fish sample (GBNI, STSA, LEFR). The tissue was rinsed with Tris-HCl (pH 8) before being placed in a tube. In the tube, 300 μl of preheated extraction buffer was added and the tissue was macerated with a pestle. Samples were incubated for 30 minutes at 60 °C, with the tubes occasionally turned gently. The samples were then extracted with a chloroform/isoamyl alcohol mixture (24:1) and centrifuged at 7700 g for 10 minutes. The aqueous (upper) phase was transferred to new tubes and the DNA was precipitated by adding 2/3 volume of isopropanol. Samples were left to stand overnight at room temperature. The next day, samples were centrifuged at 7700 g for 10 minutes and the white DNA pellet at the bottom of the tube was washed with detergent solution for 30 minutes. Samples were centrifuged once more at 7700 g for 10 minutes. Finally, the pellets were air dried and dissolved in 1x TE buffer.
DNA Amplification
The extracted (template) DNA was amplified for three genes (16S, TBR1, RAG1), each with forward and reverse primers.
The table below lists the primers used in this research.
Table 1. Primer list
Primer       Sequence
tbr1_F1      TGTCTACACAGGCTGCGACAT
tbr1_F86     GCCATGMCTGGYTCTTTCCT
tbr1_R820    GATGTCCTTRGWGCAGTTTTT
tbr1_R811    GGAGCAGTTTTTCTCRCATTC
RAG1_F1      CTGAGCTGCAGTCAGTACCATAAGATGT
RAG1_R1      CTGAGTCCTTGTGAGCTTCCATRAATTT
RAG1_R2      TGAGCCTCCATGAACTTCTGAAGRTATTT
16sar-L      CGCCTGTTTATCAAAAACAT
16sbr-H      CCGGTCTGAACTCAGATCACGT
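Several primers in Table 1 contain IUPAC ambiguity codes (M, Y, R, W), so each of them stands for a small pool of concrete sequences. The following is a minimal Python sketch, not part of the original workflow, that counts and lists those sequences; the function names `degeneracy` and `expand` are illustrative.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences copied from Table 1.
PRIMERS = {
    "tbr1_F1": "TGTCTACACAGGCTGCGACAT",
    "tbr1_F86": "GCCATGMCTGGYTCTTTCCT",
    "tbr1_R820": "GATGTCCTTRGWGCAGTTTTT",
    "tbr1_R811": "GGAGCAGTTTTTCTCRCATTC",
    "RAG1_F1": "CTGAGCTGCAGTCAGTACCATAAGATGT",
    "RAG1_R1": "CTGAGTCCTTGTGAGCTTCCATRAATTT",
    "RAG1_R2": "TGAGCCTCCATGAACTTCTGAAGRTATTT",
    "16sar-L": "CGCCTGTTTATCAAAAACAT",
    "16sbr-H": "CCGGTCTGAACTCAGATCACGT",
}


def degeneracy(primer):
    """Number of distinct sequences a degenerate primer represents."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: length {len(seq)}, degeneracy {degeneracy(seq)}")
```

Primers without ambiguity codes have a degeneracy of 1; a primer with two two-fold codes, such as tbr1_F86, expands to four sequences.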

PCR reactions were set up in commercially available "Ready-to-go PCR beads" tubes, each of which contains a small bead. Each reaction combined 1 μl forward primer, 1 μl reverse primer, 2-3 μl template DNA and distilled water to a volume of 25 μl. Tubes were mixed by gentle flicking, vortexed for about 2 seconds and centrifuged for 1 minute. The reaction mixtures were then run on the PCR machine with the program shown below.
Table 2. PCR program
Process                Temperature   Time         Cycles
Initial denaturation   95 °C         5 minutes    1
Denaturation           95 °C         30 seconds   35
Annealing              Tm °C (a)     1 minute     35
Extension              72 °C         2 minutes    35
Final extension        72 °C         5 minutes    1
Hold                   4 °C          Indefinite   1
(a) Depends on the primer. TBR1: 57 °C, 16S: 50 °C, RAG1: 54 °C.
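The reaction setup and the program in Table 2 come down to simple arithmetic. The sketch below is illustrative only: it assumes that 25 μl is the final reaction volume, and the function names are not from the original protocol. It computes the water needed per reaction and the programmed cycling time.

```python
# Annealing temperatures (°C) per gene, from the footnote of Table 2.
ANNEALING_C = {"TBR1": 57, "16S": 50, "RAG1": 54}


def water_volume(template_ul, primer_f_ul=1.0, primer_r_ul=1.0, total_ul=25.0):
    """Water (μl) needed to bring one reaction to the final volume.

    Assumes total_ul is the final reaction volume (an interpretation
    of the protocol text, not a stated fact).
    """
    water = total_ul - (template_ul + primer_f_ul + primer_r_ul)
    if water < 0:
        raise ValueError("components exceed the final reaction volume")
    return water


def program_minutes(cycles=35):
    """Programmed time of Table 2 in minutes.

    Excludes temperature ramping and the indefinite 4 °C hold.
    """
    initial_denaturation = 5.0
    per_cycle = 0.5 + 1.0 + 2.0  # denaturation + annealing + extension
    final_extension = 5.0
    return initial_denaturation + cycles * per_cycle + final_extension


if __name__ == "__main__":
    for template in (2.0, 3.0):
        print(f"{template} μl template -> {water_volume(template)} μl water")
    print(f"Programmed time: {program_minutes()} minutes")
```

With 2 μl or 3 μl of template, the water volume is 21 μl or 20 μl respectively, and the 35-cycle program takes 132.5 minutes before ramping time is added.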
Gel electrophoresis
One gram of agarose was weighed out and 100 ml of 1x TAE buffer was added. The mixture was heated carefully until the agarose had fully dissolved, then left to cool. 8 μl of GelRed was added and mixed in by swirling. The solution was poured into the casting mold and the combs were inserted immediately. The gel was left to set for about 30-40 minutes. Once it was solid, the combs and steel wedges were removed and 1x TAE buffer was poured in until the gel was completely covered. A piece of parafilm was laid on the bench and a 2 μl droplet of loading dye was placed on it for each sample. 3 μl of each sample was mixed with the loading dye by pipetting up and down, and the mixture was loaded into a well. 2 μl of DNA ladder was loaded into one of the empty wells. After all samples were loaded, the lid was closed and electrophoresis was run at 100 V for 30 minutes. The gel was then taken to the UV cabinet, where the DNA bands were checked by their glow under UV light. For samples amplified with the TBR1 primers, nested PCR was performed by repeating the PCR protocol.
PCR product cleaning
PCR products that gave clearly visible bands on the gel were cleaned with ExoSAP-IT. In 0.2 ml tubes, 15 μl of PCR product was mixed with ExoSAP-IT in a 5:2 ratio. The tubes were placed in the PCR machine and run at 37 °C for 15 minutes followed by 80 °C for 15 minutes.
Preparing tubes for sequencing
In 1.5 ml tubes, 5 μl of each PCR product was combined with 5 μl of its respective primer (forward or reverse). The tubes were then labeled with labels prepared beforehand.
Computational work
The sequencing results were stored in FASTA files. The sequences were then processed in Pregap4 and Gap4. After that, the data were concatenated using MacClade, TextWrangler and the terminal. Parsimony analysis was done in PAUP*. The substitution model for the analysis was tested with jModelTest. Bayesian analysis was then run in MrBayes. Finally, the Bayesian analysis was evaluated using MrBayes, Tracer and FigTree.
Results
Laboratory work (wet lab)
Gel electrophoresis showed that DNA extraction and amplification of the 16S and tbr1 genes succeeded for all 3 Gobiaria samples (GBNI, STSA, LEFR), whereas RAG1 amplified successfully only for GBNI. The successful amplicons were sent to Macrogen for sequencing.
Computational work
Sequence assembly
The sequences received from Macrogen were sorted by quality into good and bad reads. A good sequence should have stable base signals without many overlapping peaks. The good forward and reverse sequences of each gene were then assembled using Pregap4 and Gap4. Gap4 (Genome Assembly Program) is one of several programs for sequence assembly; others include Staden, SEQAID and CAP, and the choice depends on the sequences of interest (Bonfield et al. 1995, Scheibye-Alsing et al. 2009). Supporting programs have also been developed, such as SAET for correcting sequence reads (Masoudi-Nejad et al. 2013). Sequence assembly essentially proceeds through the following steps: reading the sequences, detecting overlaps between reads, forming contigs from the reads, aligning the multiple sequences and forming a consensus for each contig, and manually checking the assembly (Scheibye-Alsing et al. 2009).
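The overlap-detection and contig-forming steps can be illustrated with a toy example. This is a minimal sketch that only accepts exact suffix-prefix overlaps; it is not the algorithm used by Gap4, which tolerates mismatches and uses base quality values. The function names and the example reads are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_overlap(left, right, min_len=5):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_reads(forward, reverse, min_len=5):
    """Merge a forward read and a reverse read into one contig.

    The reverse read is reverse-complemented first so that both reads
    are on the same strand. Returns None if no overlap is found.
    """
    rev = reverse_complement(reverse)
    size = longest_overlap(forward, rev, min_len)
    if size == 0:
        return None
    return forward + rev[size:]


if __name__ == "__main__":
    forward_read = "ACGTTGCATGCA"                          # hypothetical read
    reverse_read = reverse_complement("CATGCAGGTTAACC")    # hypothetical read
    print(merge_reads(forward_read, reverse_read))
```

In this toy case the two reads share the six bases CATGCA, so the merged contig is the forward read followed by the non-overlapping tail of the reverse-complemented reverse read.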
After a thorough check, the sequences obtained from Macrogen turned out to be problematic. To compensate for this, sequence data were imported from NCBI.
Concatenated data set and multiple sequence alignment
60 sequences of 3 genes (16S, tbr1, RAG1) from the fish groups Anabantaria, Carangaria, Gobiaria, Pelagaria and Syngnatharia were obtained from NCBI. Two outgroup species, Gyrinocheilus aymonieri and Hemigrammus erythrozonus, were chosen because they belong to Otophysa, which is not closely related to the ingroup under study.
Building a concatenated data set means combining the data from the 3 genes for all species into one data set. Several programs were used in combination for this. Each gene first has to be aligned on its own, which can be done in AliView. Many programs can perform multiple sequence alignment, such as ClustalX, Mesquite and SeaView, but AliView has the advantage of handling large files with faster loading and lower memory demand (Larsson 2014). In AliView, each gene was trimmed to the same homologous span and aligned with one of the built-in alignment programs, such as MUSCLE. Once every gene was aligned, the alignments were concatenated into one large file using the terminal.
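The concatenation step can be sketched in a few lines of Python. This is an illustration only, not the terminal commands actually used; it assumes each gene alignment is a FASTA text with one record per taxon and pads taxa that lack a gene with "?". The taxon names in the example are placeholders.

```python
def read_fasta(text):
    """Parse FASTA text into a {name: sequence} dictionary."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate(alignments):
    """Join per-gene alignments into one matrix.

    Returns the concatenated sequences and the 1-based (start, end)
    column range of each gene. Missing genes are padded with '?'.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    matrix = {taxon: "" for taxon in taxa}
    boundaries, start = [], 1
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        length = lengths.pop()
        for taxon in taxa:
            matrix[taxon] += aln.get(taxon, "?" * length)
        boundaries.append((start, start + length - 1))
        start += length
    return matrix, boundaries


if __name__ == "__main__":
    gene_a = read_fasta(">sp1\nACGT\n>sp2\nAC-T\n")  # placeholder data
    gene_b = read_fasta(">sp1\nGG\n")                # sp2 lacks this gene
    matrix, boundaries = concatenate([gene_a, gene_b])
    for taxon, seq in matrix.items():
        print(taxon, seq)
    print(boundaries)
```

The returned column ranges are what a partitioned analysis needs later, since each gene can then be given its own substitution model.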

Parsimony analysis
Phylogenetic analysis was then done for each gene and for the combined sequences using PAUP*. This program was used to analyse the sequence data under the parsimony criterion. Maximum parsimony is one of the most widely used criteria in evolutionary analysis. Its hypothesis is that the best tree is the one with the minimum number of changes along its edges (Jin et al. 2007). The result is a parsimony tree with bootstrap support values shown at the nodes. The results can be seen in the figures below.
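To make the parsimony criterion concrete, the sketch below counts the minimum number of changes a given tree requires, using Fitch's algorithm for a rooted binary tree. It is an illustration with made-up taxa and sequences; PAUP* additionally searches over many candidate trees, which is not shown here.

```python
def fitch_site(tree, states):
    """Return (possible_states, changes) for one alignment column.

    `tree` is either a leaf name (str) or a 2-tuple of subtrees.
    `states` maps each leaf name to its base at this column.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    shared = left_set & right_set
    if shared:
        return shared, left_cost + right_cost
    # No shared state: one change is needed at this node.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Total number of changes over all columns (the parsimony score)."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


if __name__ == "__main__":
    # Hypothetical four-taxon alignment.
    alignment = {"A": "AAG", "B": "AAG", "C": "CTG", "D": "CTG"}
    tree_1 = (("A", "B"), ("C", "D"))
    tree_2 = (("A", "C"), ("B", "D"))
    print(tree_length(tree_1, alignment), tree_length(tree_2, alignment))
```

Under the parsimony criterion the first tree is preferred here, because it explains the same data with fewer changes than the second.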
Legend: Blue: Carangaria; Red: Pelagaria; Green: Anabantaria; Purple: Syngnatharia; Black: Outgroup (Otophysa)
Figure 2. Parsimony tree on 16S genes
Legend: Blue: Carangaria; Red: Pelagaria; Green: Anabantaria; Purple: Syngnatharia; Black: Outgroup (Otophysa)
Figure 3. Parsimony tree on tbr1 genes
Legend: Blue: Carangaria; Red: Pelagaria; Green: Anabantaria; Purple: Syngnatharia; Black: Outgroup (Otophysa)
Figure 4. Parsimony tree on RAG1 genes
Legend: Blue: Carangaria; Red: Pelagaria; Green: Anabantaria; Purple: Syngnatharia; Black: Outgroup (Otophysa)
Figure 5. Parsimony tree on combined genes
The results show that Gobiaria is reciprocally monophyletic in every tree, but its relationship with the sister groups is mostly unresolved. In the 16S parsimony tree Gobiaria is closer to Carangaria. The tbr1 parsimony tree also places Gobiaria closer to Carangaria and Pelagaria, although these are paraphyletic. The RAG1 and combined parsimony trees do not show a clear closest relative. All trees still contain a polytomy, although in the RAG1 and combined parsimony trees the Gobiaria group falls outside it.
To check whether the gene partitions are congruent and can reasonably be combined, an ILD (Incongruence Length Difference) test was run on the data. The test compares the summed length of the most parsimonious trees for the original gene partitions with the lengths obtained from random partitions of the same sizes; its behaviour has been studied with DNA sequences simulated under various evolutionary conditions (Darlu and Lecointre 2002). The ILD test gave a value of 0.01, which is significant. Strictly speaking, a significant ILD result indicates incongruence between the partitions, so the combined tree should be interpreted with some caution.
Model test
Before the Bayesian analysis, a substitution model test was run to determine which substitution model fits each gene best. jModelTest was used for this test. Many other programs could be used, and there are also free web services for running the analysis online. The results are shown in the table below.

Table 3. jModelTest results
Gene   Model   Rate distribution   Invariant sites
16S    GTR     Gamma               +I
tbr1   GTR     Gamma               +I
RAG1   HKY     Gamma               -

The results of this test were then incorporated into the Bayesian analysis.
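One way the models from Table 3 could be passed to MrBayes is through a partitioned MrBayes block. The Python sketch below writes such a block as text. Several things here are assumptions rather than the author's actual settings: the gene lengths are hypothetical, "Inverse" in Table 3 is read as a proportion of invariable sites, and the `unlink` and `ratepr=variable` lines are a common choice for partitioned data that the original text does not confirm.

```python
# Models per gene in MrBayes terms: (nst, rates).
# GTR -> nst=6, HKY -> nst=2; gamma plus invariable sites -> invgamma.
# Assumption: "Inverse" in Table 3 denotes invariable sites (+I).
MODELS = {
    "16S": (6, "invgamma"),
    "tbr1": (6, "invgamma"),
    "RAG1": (2, "gamma"),
}


def mrbayes_block(gene_lengths, ngen=10_000_000):
    """Build the text of a MrBayes block for a concatenated alignment.

    `gene_lengths` is an ordered list of (gene, aligned_length) pairs
    matching the order of genes in the concatenated matrix.
    """
    lines = ["begin mrbayes;"]
    start = 1
    for gene, length in gene_lengths:
        end = start + length - 1
        lines.append(f"  charset gene_{gene} = {start}-{end};")
        start = end + 1
    names = ", ".join(f"gene_{gene}" for gene, _ in gene_lengths)
    lines.append(f"  partition by_gene = {len(gene_lengths)}: {names};")
    lines.append("  set partition = by_gene;")
    for index, (gene, _) in enumerate(gene_lengths, start=1):
        nst, rates = MODELS[gene]
        lines.append(f"  lset applyto=({index}) nst={nst} rates={rates};")
    # Assumed settings: independent parameters and rates per gene.
    lines.append("  unlink statefreq=(all) revmat=(all) shape=(all) pinvar=(all);")
    lines.append("  prset applyto=(all) ratepr=variable;")
    lines.append(f"  mcmc ngen={ngen} nruns=2;")
    lines.append("end;")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical aligned lengths, for illustration only.
    print(mrbayes_block([("16S", 500), ("tbr1", 700), ("RAG1", 1400)]))
```

The generated text would be appended to the NEXUS file holding the concatenated alignment; the character ranges must match the real gene boundaries in that file.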

Bayesian analysis
MrBayes was used to run the Bayesian analysis. Bayesian analysis is an optimality criterion that addresses a problem of maximum likelihood analysis, in which estimates can come out as zero or negative and therefore make no sense. Because Bayesian analysis takes a prior distribution into account, only reasonable results are obtained (Hirose et al. 2011). The analysis used the MrBayes default settings, with the MCMC run for 10,000,000 generations and an uninformative prior. The result is shown in the tree below.
Legend: Blue: Carangaria; Red: Pelagaria; Green: Anabantaria; Purple: Syngnatharia; Black: Outgroup (Otophysa)
Figure 6. Bayesian tree on the combined genes
The Bayesian tree shows that Gobiaria is reciprocally monophyletic and closely related to the Carangaria and Syngnatharia groups. The polytomy is still unresolved.
Evaluating the Bayesian analysis
Evaluation of the Bayesian analysis showed that in the 2 independent, parallel Markov chain runs all parameters converged with an adequate burn-in. The combined Markov chains also showed converged parameters with an adequate burn-in, which indicates that the chains had run long enough. The analysis also gives a mutation rate for each gene, shown in the table below.
Table 4. Mutation rate of each gene
Gene   Mutation rate
16S    1.857
tbr1   0.675
RAG1   0.328
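The convergence check on the two parallel runs can be illustrated with the potential scale reduction factor (PSRF, Gelman-Rubin diagnostic), one common convergence measure. The original text does not state which diagnostic was inspected, so this is a sketch under that assumption. It uses simulated samples, not the author's MCMC output, and the 25% burn-in fraction is likewise assumed.

```python
import random
from statistics import mean, variance


def psrf(chains, burnin_fraction=0.25):
    """Potential scale reduction factor for one parameter.

    `chains` is a list of sample lists, one per independent run.
    Values close to 1 suggest the runs sample the same distribution.
    """
    kept = [chain[int(len(chain) * burnin_fraction):] for chain in chains]
    n = min(len(chain) for chain in kept)
    kept = [chain[:n] for chain in kept]
    chain_means = [mean(chain) for chain in kept]
    within = mean([variance(chain) for chain in kept])  # W
    between = n * variance(chain_means)                 # B
    pooled = (n - 1) / n * within + between / n         # pooled variance estimate
    return (pooled / within) ** 0.5


if __name__ == "__main__":
    random.seed(1)
    # Simulated samples standing in for one parameter from each run.
    run_1 = [random.gauss(1.0, 0.1) for _ in range(2000)]
    run_2 = [random.gauss(1.0, 0.1) for _ in range(2000)]
    run_off = [random.gauss(2.0, 0.1) for _ in range(2000)]
    print("agreeing runs:", round(psrf([run_1, run_2]), 3))
    print("disagreeing runs:", round(psrf([run_1, run_off]), 3))
```

Two runs that sample the same distribution give a value very close to 1, while runs stuck in different regions give a value well above 1.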



Discussion
The parsimony and Bayesian analyses all show that Gobiaria is indeed a monophyletic group, and several trees place Gobiaria closest to Carangaria, although with low support (low bootstrap values and an unresolved polytomy). This result shows that the different loci share some of their evolutionary history, but not all of it. The 16S gene places Gobiaria close to Carangaria. The tbr1 gene places Gobiaria close to Carangaria and Pelagaria, although in a paraphyletic relationship. RAG1 and the combined parsimony analysis do not show Gobiaria to have a clear relationship with any group. The Bayesian tree for the combined data shows Gobiaria closely related to Carangaria and Syngnatharia.
The results show that the 16S gene has the highest mutation rate and RAG1 the lowest. A lower mutation rate means that the gene is more conserved across species, giving similar base sequences when aligned. With similar sequences, tree resolution should be higher and the relationships between species more clearly resolved. Among the individual gene parsimony trees, the 16S tree has the most polytomies, which is expected given its higher mutation rate but not expected of the gene itself: 16S is a mitochondrial gene and should be highly conserved. This could be because the environment in which the fish live affects the gene. To resolve this, one could add more taxa or use real data from the wet lab rather than from NCBI.
Comparing the two criteria used in this analysis, parsimony is simple and easy to use, but it is inflexible and has no parameters that can be incorporated, so its best tree might not be the true tree. Bayesian analysis, in contrast, can incorporate parameters and a model and is statistically consistent, although it depends heavily on the model and is slow. In this research, the two analyses gave almost the same result, although the Bayesian tree was more clearly resolved than the parsimony trees.
In conclusion, Gobiaria shows clear reciprocal monophyly and appears most closely related to Carangaria, although the polytomy is still unresolved. In the future, it would be better to use more taxa, more loci, primary data from the wet lab and an informative prior for the Bayesian analysis.
References
Berkovitz B, Shellis P. 2017. The Teeth of Non-Mammalian Vertebrates. Elsevier Inc., London.
Bonfield JK, Smith KF, Staden R. 1995. A New DNA Sequence Assembly Program. Nucleic Acids Research 23(24): 4992-4999.
Darlu P, Lecointre G. 2002. When Does the Incongruence Length Difference Test Fail? Molecular Biology and Evolution 19(4): 432-437.
David JG, Wiley EO. 2007. Percomorphs 09 January 2007: http://tolweb.org/Percomorpha/52146/2007.01.09. Accessed 13 January 2017.
Hirose K, Kawano S, Konishi S, Ichikawa M. 2011. Bayesian Information Criterion and Selection of the Number of Factors in Factor Analysis Model. Journal of Data Science 9: 243-259.
Jin G, Nakhleh L, Snir S, Tuller T. 2007. Inferring Phylogenetic Networks by the Maximum Parsimony Criterion: A Case Study. Molecular Biology and Evolution 24(1): 324-337.
Larsson A. 2014. AliView: A Fast and Lightweight Alignment Viewer and Editor for Large Datasets. Bioinformatics 30(22): 3276-3278.
Masoudi-Nejad A, Narimani Z, Hosseinkhan N. 2013. The Assembly of Sequencing Data. SpringerBriefs in Systems Biology 4: 41-54.
Sanciangco MD, Carpenter KE, Betancur-R R. 2015. Phylogenetic Placement of Enigmatic Percomorph Families (Teleostei: Percomorphaceae). Molecular Phylogenetics and Evolution 94: 565-576.
Scheibye-Alsing K, Hoffman S, Frankel A, Jensen P, Stadler PF, Mang Y, Tommerup N, Gilchrist MJ, Nygard AB, Cirera S, Jorgensen CB, Fredholm M, Gorodkin J. 2009. Sequence Assembly. Computational Biology and Chemistry 33(2): 121-136.
