BLAST interpretation
The following questions require interpretation of BLAST searches.
You may find these NCBI help webpages useful for your reference:
● The BLAST glossary.
● The guide to BLAST home and search pages.
● BLAST video tutorials.
1. How many basepairs of the top sequence in this pairwise alignment are inferred to share
positional homology with a basepair of the bottom sequence? Enter your count of such
basepairs as a whole number without units.
2. INSTRUCTIONS:
1. Read the imaginary email below.
2. Run a BLASTN search of an appropriate NCBI database using the DNA sequence
contained in the email as a query.
1. Copy and paste the sequence from the email in the BLAST query box.
2. Select the "Nucleotide collection (nr/nt)" database.
3. Select the BLASTN program/algorithm.
4. Under "Algorithm parameters", specify an "Expect threshold" of 1000 in the
text box.
5. Leave all other options at their default setting.
6. Note: If you set the options correctly, you will retrieve some hits.
3. Analyze the results of the BLASTN search by applying the concepts discussed in
class.
4. Write a response in the space provided (10 marks in total as indicated). Format your
response as a coherent email [1 mark] in your own words. Your response should
include a prediction regarding the homology or analogy of the sequence to the
sequences you identified using BLASTN (if any) [1 mark], the name of the specific
NCBI database that you queried, justification of your prediction based on
interpretation of relevant BLAST outputs including sequence comparison metrics
discussed in class [4 marks], and consideration of the evidence in the context of the
problem described in the email [1 mark].
https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs
https://www.ncbi.nlm.nih.gov/books/NBK62051
https://ftp.ncbi.nlm.nih.gov/pub/factsheets/HowTo_BLASTGuide.pdf
https://www.youtube.com/playlist?list=PL7dF9e2qSW0azL2xOKAtxDW7QI8UU4XZ6
5. Also, include in your response a question for the (imaginary) sender, requesting
information which they might have, and which could help you better infer homology
or analogy [1 mark].
6. Cite any sources you refer to by including the author, year, and Digital Object Identifier
(DOI) number (or website URL if relevant) [1 mark]. At least one source must be cited.
7. Limit your response length to a minimum of 300 words and a maximum of 500 words
[1 mark]. (Hint: Compose first in a google document, count words with the word
count tool, then paste into the response box).
8. Please do not sign your response with your real name. You may use a pseudonym
instead if you wish.
IMAGINARY EMAIL:
__________________________________________________________________________________________________
From: XXXXXXXXXX
Date: September 30, 2034
Dear Dr. Student,
I am the chief scientific officer of MarsX. We recently received a transmission from personnel at our
Martian base station, and they have apparently identified living cells in samples obtained from deep
in the Martian subsurface.
To verify whether these are the first forms of Martian life discovered, or microbes introduced from
Earth, they used a portable sequencing device in an attempt to obtain identifying DNA sequences.
Unfortunately, they ran out of reagents before sequencing was complete. This is the only sequence
they obtained:
001 Unknown martian DNA sequence
TCCGGTGGCAATGGCGGAGG
Our staff biologists cannot tell whether the sequence is from any known terrestrial organism or not.
If not, we assume it could be of Martian origin and demonstrate that Martians have the same genetic
material as terrestrial organisms. This is a matter of urgent importance for us, as our prediction
regarding the origin of these organisms will determine what experimental equipment we send via our
next Earth-to-Mars mission, which is scheduled to launch in three days.
Any advice you can provide would be greatly appreciated, and you would be credited appropriately in
any resulting scientific publications.
Thank you in advance,
Sincerely,
Dr. Chuck Darwin
--
Chief Scientific Officer, MarsX
__________________________________________________________________________________________________
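When weighing hits for a query as short as the 20-nucleotide sequence in the email, it can help to estimate how often an exact match would turn up purely by chance. The sketch below is only a rough illustration: it assumes a database of independent, equiprobable bases, counts one strand only, and uses a hypothetical placeholder for the database size rather than the real size of the nucleotide collection. BLAST's reported E-values come from its own statistical model, not from this simplified count.

```python
def expected_chance_matches(query_length: int, database_length: int) -> float:
    """Expected number of exact matches of one fixed query in random DNA.

    Assumes independent, equiprobable bases (A, C, G, T) and counts
    start positions on one strand only; a deliberate simplification.
    """
    if query_length < 1 or database_length < query_length:
        return 0.0
    # Every possible start position matches with probability (1/4)^length.
    start_positions = database_length - query_length + 1
    return start_positions * 0.25 ** query_length


query = "TCCGGTGGCAATGGCGGAGG"  # sequence quoted in the email
hypothetical_db_length = 10 ** 12  # placeholder size, NOT the real database size
print(len(query))
print(expected_chance_matches(len(query), hypothetical_db_length))
```

The larger the database and the shorter the query, the larger this expected count becomes, which is one reason a perfect match to a very short query is weak evidence of homology on its own.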
Sequence similarity clustering case study material
The following questions refer to the information below.
Below is a table of percent sequence identities calculated from pairwise alignments of 16S rRNA
sequences from 13 species of bacteria in the genus Bacillus (abbreviated as "B." in the table). All the
sequences were compared against all the others. So, for example, the percent sequence identity in
the pairwise alignment of B. brevis 16S rRNA and B. alvei 16S rRNA is 89.2%. The image of this table
is also available in PDF format here: link to PDF.
https://drive.google.com/file/d/1yoUE0CCjosZDE8uvVzvW2h9doZ3L27cd/view?usp=sharing
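As a reminder of what a percent identity value summarises, here is a minimal sketch that counts the aligned (non-gap) columns of a pairwise alignment and the share of those columns holding identical bases. The function name, the gap character, and the toy sequences are assumptions made for illustration; they are not taken from the Bacillus table, and other tools may use a different denominator (for example, total alignment length including gaps).

```python
from typing import Tuple


def percent_identity(top: str, bottom: str, gap: str = "-") -> Tuple[int, float]:
    """Return (aligned_columns, percent_identity) for two aligned sequences.

    aligned_columns counts columns where neither row has a gap, i.e. sites
    inferred to be positionally homologous. Identity is computed over those
    columns only.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(top.upper(), bottom.upper()):
        if a == gap or b == gap:
            continue  # a gap column pairs a base with nothing
        aligned += 1
        if a == b:
            identical += 1
    identity = 100.0 * identical / aligned if aligned else 0.0
    return aligned, identity


# Hypothetical toy alignment, not taken from the Bacillus table.
columns, identity = percent_identity("ACG-TTGA", "ACGATT-C")
print(columns, round(identity, 1))
```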
Regarding the table of Bacillus 16S rRNA pairwise sequence identities, which Bacillus species
contains the 16S rRNA orthologue with the greatest proportion of identical nucleotide bases at sites
which are inferred to be homologous following pairwise alignment with the Bacillus subtilis 16S
rRNA sequence?
Select one:
a.
B. stearothermophilus
b.
B. laterosporus
c.
B. cycloheptanicus
d.
B. megaterium
e.
B. polymyxa
f.
B. marinus
g.
B. coagulans
h.
B. alvei
i.
B. brevis
j.
B. psychrophilus
k.
B. marcerans
l.
B. macquariensis
Based on the table of pairwise percent identities of Bacillus 16S rRNA sequences, which of these
rooted tree topologies would you infer? Assume that greater sequence identity implies more recent
common ancestry. Select all that apply.
Select all that apply:
a.
[tree topology image]
b.
[tree topology image]
c.
[tree topology image]
d.
[tree topology image]
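Under the stated assumption that greater sequence identity implies more recent common ancestry, a rooted topology can be derived from a table of pairwise identities by repeatedly joining the two most similar groups. The sketch below uses average-linkage clustering (UPGMA) on 100 minus percent identity; this is one possible method chosen for illustration, not necessarily the one used in class. The taxon names A to D and their identity values are hypothetical and are not taken from the Bacillus table.

```python
from itertools import combinations
from typing import Dict, FrozenSet


def upgma_topology(identities: Dict[FrozenSet[str], float]) -> str:
    """Cluster taxa by average linkage (UPGMA) on 100 - percent identity.

    identities maps frozenset({taxon1, taxon2}) -> percent identity.
    Returns a parenthesised rooted topology without branch lengths.
    """
    taxa = sorted({t for pair in identities for t in pair})
    # Each cluster label maps to the tuple of taxa it contains.
    clusters = {t: (t,) for t in taxa}

    def distance(c1: str, c2: str) -> float:
        # Mean distance over all cross-cluster pairs of taxa.
        pairs = [(a, b) for a in clusters[c1] for b in clusters[c2]]
        return sum(100.0 - identities[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two clusters with the smallest mean distance.
        c1, c2 = min(combinations(sorted(clusters), 2),
                     key=lambda p: distance(*p))
        merged = f"({c1},{c2})"
        clusters[merged] = clusters.pop(c1) + clusters.pop(c2)
    return next(iter(clusters))


# Hypothetical percent identities for four placeholder taxa.
hypothetical_identities = {
    frozenset({"A", "B"}): 99.0,
    frozenset({"A", "C"}): 95.0,
    frozenset({"B", "C"}): 94.0,
    frozenset({"A", "D"}): 90.0,
    frozenset({"B", "D"}): 89.0,
    frozenset({"C", "D"}): 91.0,
}
print(upgma_topology(hypothetical_identities))
```

In this toy table A and B are joined first because they share the highest identity, and D joins last because it is the least similar to everything else.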
Types of homology case study material
The following questions refer to the phylogenetic tree described below.
The image below is a phylogenetic tree of EF-Tu (EF-1α) and EF-G (EF-2) protein-coding genes.
Branch supports are bootstrap percentages (bootstrap percentages from multiple methods are
shown on some branches in bold font). The tip labels indicate the genus (or in some cases the
genus and species) of the organism from which the gene sequences were derived. The brackets on
the right side of the tree indicate the domain of life into which the genera/species are classified: Euk
= Eukaryotes, Arc = Archaea, Eub = Eubacteria (Bacteria). This image is also available in PDF format
here: link to PDF.
https://drive.google.com/file/d/1dTTN-NutMPkiOpKwF0cYO0po70MvV7e0/view?usp=sharing
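Bootstrap percentages like those on the tree are obtained by resampling the columns of the alignment with replacement, inferring a tree from each resampled alignment, and reporting the percentage of those trees that contain a given branch. The sketch below shows only the resampling step, assuming the alignment is held as a dictionary of equal-length strings; the function name and the toy sequences are hypothetical, and the tree inference step is left out.

```python
import random
from typing import Dict, Optional


def bootstrap_alignment(alignment: Dict[str, str],
                        rng: Optional[random.Random] = None) -> Dict[str, str]:
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    rng = rng or random.Random()
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    picks = [rng.randrange(n_columns) for _ in range(n_columns)]
    # Apply the same picks to every sequence so columns stay intact.
    return {name: "".join(seq[i] for i in picks)
            for name, seq in alignment.items()}


# Hypothetical toy alignment, not derived from the EF gene data.
toy_alignment = {"taxon1": "ACGT", "taxon2": "ACGA", "taxon3": "TCGA"}
replicate = bootstrap_alignment(toy_alignment, random.Random(0))
print(replicate)
```

Because every sequence is indexed with the same column picks, each column of the replicate is a complete column of the original alignment, possibly repeated or omitted.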
Choose the best interpretation of the phylogenetic tree of EF protein genes shown in the case study
material.
a.
This tree is inconsistent with the Eocyte hypothesis, because organisms of the taxonomic domain
Archaea form a paraphyletic group. This topology is consistent with a two-domain tree of life.
b.
This tree refutes the Eocyte hypothesis, because organisms of the taxonomic domain Archaea form
a monophyletic group. This topology is consistent with a three-domain tree of life.
c.
This tree refutes the Eocyte hypothesis, because organisms of the taxonomic domain Archaea form
a monophyletic group. This topology is consistent with a two-domain tree of life.
d.
This tree supports the Eocyte hypothesis, because organisms of the taxonomic domain Archaea
form a monophyletic group. This topology is consistent with a two-domain tree of life.
e.
This tree supports the Eocyte hypothesis, because organisms of the taxonomic domain Archaea
form a polyphyletic group. This topology is consistent with a two-domain tree of life.
f.
This tree supports the Eocyte hypothesis, because organisms of the taxonomic domain Archaea
form a paraphyletic group. This topology is consistent with a two-domain tree of life.
g.
This tree supports the Eocyte hypothesis, because organisms of the taxonomic domain Archaea
form a paraphyletic group. This topology is consistent with a three-domain tree of life.
Choose the statement that best characterizes the phylogenetic tree of elongation factor genes
shown in the case study.
a.
This is an arbitrarily rooted phylogram of homologous genes (which code for elongation factor
proteins) containing two paralogous clades. The bifurcation between these two clades is found in
some topologies inferred from bootstrapped alignments. Poorly supported branches in the topology
of the tree shown are consistent with the root of the tree of life being on a branch between ancestral
prokaryotes, implying that eukaryotes originated after diversification of prokaryotes.
b.
This is an arbitrarily unrooted phylogram of homologous genes (which code for elongation factor
proteins) containing two paralogous clades. The bifurcation between these two clades is found in
some topologies inferred from bootstrapped alignments. Weakly supported branches in the topology
of the tree shown are consistent with the root of the tree of life being on a branch between an
ancestral bacterium and an ancestral archaeon, implying that eukaryotes originated after
diversification of bacteria.
c.
This is an arbitrarily rooted phylogram of homologous genes (which code for elongation factor
proteins) containing two orthologous clades. The split between these two clades is found in all
topologies inferred from bootstrapped alignments. Strongly supported branches in the topology of
the tree shown are consistent with the root of the tree of microbial life being on a branch between
ancestral prokaryotes, implying that prokaryotes originated after diversification of eukaryotes.
d.
This is an arbitrarily rooted phylogram of homologous genes (which code for elongation factor
proteins) containing three paralogous clades. The bifurcation between these three clades is found in
all topologies inferred from bootstrapped alignments. Strongly supported branches in the topology
of the tree shown are consistent with the outgroup of the tree of life being on a branch between
ancestral prokaryotes, implying that eukaryotes originated after diversification of prokaryotes.
e.
This is an unrooted phylogram of homologous genes (which code for elongation factor proteins)
containing two paralogous clades. The bifurcation between these two clades is found in some
topologies inferred from bootstrapped alignments. Strongly supported branches in the topology of
the tree shown are consistent with the root of the tree of life being on a branch between ancestral
prokaryotes, implying that eukaryotes originated after diversification of prokaryotes.
f.
This is an arbitrarily rooted phylogram of homologous genes (which code for elongation factor
proteins) containing two paralogous clades. The bifurcation between these two clades is found in all
topologies inferred from bootstrapped alignments. Strongly supported branches in the topology of
the tree shown are consistent with the root of the tree of life being on a branch between ancestral
eukaryotes, implying that prokaryotes originated after diversification of eukaryotes.
g.
This is an arbitrarily rooted phylogram of homologous genes (which code for elongation factor
proteins) containing two paralogous clades. The bifurcation between these two clades is found in all
topologies inferred from bootstrapped alignments. Strongly supported branches in the topology of
the tree shown are consistent with the root of the tree of life being on a branch between ancestral
prokaryotes, implying that eukaryotes originated after diversification of prokaryotes.