Python (sequence analysis)
$8-15 USD / hour
In 1990, Michael Crichton published the book Jurassic Park, about the resurrection of dinosaurs
using blood from the stomachs of insects that had been encased in amber. At one point in
the book, Dr. Henry Wu is asked to explain some of the DNA techniques used in reconstructing
the extinct dinosaur genomes. Dr. Wu describes the use of restriction enzymes and how the
fragmented pieces of dino DNA can be spliced together with these enzymes. He also alludes
to the fact that they don’t have the entire genome but “fill in the gaps” with modern-day
frog DNA. At one point during his discussion he points to a computer screen and remarks,
“Here you see the actual structure of a small fragment of dinosaur DNA.”
In 1992, Dr. Mark Boguski at NCBI entered this sequence into a text editor and searched it
against all of the DNA sequences known at the time. Dr. Boguski wrote up his findings and
submitted a manuscript to the journal BioTechniques as a tongue-in-cheek joke. His manuscript was
accepted and published. (Boguski, M.S. A Molecular Biologist Visits Jurassic Park. (1992)
BioTechniques 12(5):668-669).
You will reproduce this experiment using BLAST. ([login to view URL])
Submit the “dinosaur DNA” sequence you can find in the file [login to view URL] to a
Nucleotide-nucleotide BLAST (blastn) search. How many of the top ten matches are artificial sequences?
Identify any actual organisms in the top ten.
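Before pasting a query into the blastn form, it can help to load the FASTA file and check that it contains only nucleotide characters. The following is a minimal sketch using only the Python standard library; the function names `read_fasta` and `non_nucleotide_positions` and the toy record are made up for illustration and are not the sequence from the book.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines are joined and upper-cased.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def non_nucleotide_positions(sequence, alphabet="ACGTN"):
    """Return 0-based positions of characters outside the given alphabet."""
    return [i for i, base in enumerate(sequence) if base not in alphabet]


# Toy input, invented for this example (NOT the "dinosaur DNA").
example = ">toy_query made-up sequence\nacgtac\nGTNNA\n"
records = read_fasta(example)
for header, sequence in records:
    print(header, len(sequence), non_nucleotide_positions(sequence))
```

To use it on a real file, read the file contents with `open(path).read()` and pass the text to `read_fasta`; an empty list from `non_nucleotide_positions` suggests the sequence is safe to submit as a nucleotide query.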
Mark Boguski’s published article was brought to Crichton’s attention. For the sequel,
“The Lost World”, Mr. Crichton used Dr. Boguski as a consultant. Dr. Boguski constructed
an interesting sequence from existing species and also embedded a message in the protein
translation of the DNA sequence, which he submitted for use in the book.
Once again, invoke Nucleotide-nucleotide BLAST (blastn) with the second “dinosaur DNA”
sequence you can find in the file dino2.fasta. Identify all organisms of the top ten matches.
Are any of these organisms related to dinosaurs?
Now use Translated query vs. protein database BLAST (blastx) with the same sequence and
the Swiss-Prot database. Look at the amino acid sequence of the query sequence aligned to the
best hit. What is the hidden message Dr. Boguski included in this sequence?
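blastx translates the nucleotide query in all six reading frames before comparing it to the protein database. As a rough illustration of that translation step (not a replacement for the blastx search), the sketch below translates a DNA string in six frames with the standard genetic code. The helper names `translate`, `reverse_complement`, and `six_frames` are chosen for this example, and the input is a toy string, not dino2.fasta.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 1; unknown codons become 'X', stops become '*'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (N stays N)."""
    return dna.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def six_frames(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    frames = {}
    reverse = reverse_complement(dna)
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


# Toy input, invented for this example.
for label, protein in six_frames("ATGGCCTAA").items():
    print(label, protein)
```

Reading the six translations of the real query next to the blastx alignment shows which frame the best hit uses; stretches of the query translation that do not match the hit are where an embedded message would stand out.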
Hand in a well-documented exercise that contains the sequences, sources, output alignments
and scores, and the parameters used for the algorithms. One major criterion for the grading of
this exercise is reproducibility.
Hint: Use the blastn and not the megablast option. “PREDICTED” sequences count as hits.
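One way to support the reproducibility requirement is to write down each search as a small structured record next to the saved output. The sketch below is only a suggestion: the function `make_search_record`, the field names, and the parameter values are placeholders invented for this example and should be replaced with whatever was actually used in the search.

```python
import json
from datetime import date


def make_search_record(program, database, query_file, parameters, run_date=None):
    """Bundle the settings of one BLAST search into a JSON-serializable dict."""
    return {
        "program": program,
        "database": database,
        "query_file": query_file,
        # Sort parameter names so that records are easy to compare.
        "parameters": dict(sorted(parameters.items())),
        # Databases change over time, so the search date matters.
        "run_date": (run_date or date.today()).isoformat(),
    }


# Placeholder values: replace them with the settings of the real search.
record = make_search_record(
    program="blastx",
    database="Swiss-Prot",
    query_file="dino2.fasta",
    parameters={"max_target_sequences": 10, "note": "placeholder values"},
)
print(json.dumps(record, indent=2))
```

Keeping one such record per search (blastn for each query, blastx for the second) alongside the alignments makes it clear which program, database, and options produced each result.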
Project ID: #16087756
About the project
Awarded to:
Hi, I am a PhD in bioinformatics, biotechnology and microbiology. I have 7 years of research and writing experience and have worked on PhD-level thesis projects, published papers in peer-reviewed journals and have also wo
4 freelancers have bid an average of $23/hour for this job
Hello, Sir. How are you? I have read your project description and I am going to use DT, RF and DNN to implement your project. What about