Streptococcus mutans is a causative agent of tooth decay and gingivitis in humans, and M102 is a bacteriophage that targets S. mutans. S. mutans uses CRISPR/Cas to "acquire immunity" against M102, which in turn gives a selective advantage to M102 mutants carrying mutations in the genomic region that S. mutans cleaves and integrates into its CRISPR array. All of this is happening right now in all of our mouths! See the paper by Jan R. van der Ploeg of the University of Zurich. Also see the papers of Emmanuelle Charpentier and Jennifer Doudna, recipients of the 2020 Nobel Prize in Chemistry for the application of CRISPR to gene editing:
In the upper left corner, use the browse tool to load the FASTA-format file containing the S. mutans genome. Now run CRT and print the output. Since the locations of the repeats and the spacers are indicated (spacers are integrated portions of viral genomes), you can then look at the GenBank-format file of the S. mutans genome to see whether there is any corresponding annotation.
If Java is not installed on your computer, you can find an appropriate version to download by typing "download java development kit" or "download jdk" into the Google search engine.
Run CRT on the FASTA sequence of S. mutans to determine (likely) CRISPR repeats and spacers. Print out the results. Consult the GenBank file of S. mutans, and give the annotation for the region overlapping the predicted CRISPR region.
Solution:
complement(1663207..1663759)
/note="potential frameshift: common BLAST hit:
gi|337281888|ref|YP_004621359.1| CRISPR-associated protein
cas1"
which indicates the location of the gene for the cas1 protein; however,
no CRISPR array location is given for incorporation of viral-derived
spacers separated by repeats.
Since the annotation of the same region is as follows,
gene 132644..134389
/gene="adhD"
/locus_tag="SMU_130"
/old_locus_tag="SMU.130"
/db_xref="GeneID:1029705"
there is a disagreement between GenBank annotation and the output of CRT.
This indicates that (1) databases sometimes contain incorrect
or incomplete annotations, (2) the output of prediction software can be
incorrect, or (3) both. Determining which is most likely correct requires
an investment of much more time, knowledge of the adhD gene annotation
(i.e. how the annotation was made, via BLAST or other software, the
E-value measuring statistical significance if BLAST was used, whether related
bacteria have similar annotations, the function of the adhD gene, etc.),
as well as knowledge of how the CRISPR prediction was made and its
statistical significance. A final possibility is that both annotation and
prediction are correct, and that the CRISPR array is found in the 5'
untranslated region (5'-UTR) of the adhD gene.
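The comparison above comes down to asking which annotated features overlap the coordinate range reported by CRT. Below is a minimal Python sketch of that check. The function names and the `predicted_crispr` range are hypothetical (the actual CRT output is not reproduced in this section), and only simple locations of the form `a..b` or `complement(a..b)` are handled.

```python
import re

def parse_location(location):
    """Parse a simple GenBank location such as '132644..134389' or
    'complement(1663207..1663759)' into (start, end, strand).
    Compound locations (join, order, fuzzy ends) are not handled."""
    strand = "-" if location.startswith("complement(") else "+"
    match = re.search(r"(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    return int(match.group(1)), int(match.group(2)), strand

def overlaps(a_start, a_end, b_start, b_end):
    """True if the closed intervals [a_start, a_end] and [b_start, b_end]
    share at least one base."""
    return a_start <= b_end and b_start <= a_end

def overlapping_features(features, region):
    """Return the names of features whose location overlaps region=(start, end)."""
    hits = []
    for name, location in features.items():
        start, end, _strand = parse_location(location)
        if overlaps(start, end, region[0], region[1]):
            hits.append(name)
    return hits

# Feature locations quoted from the GenBank excerpts in this solution.
features = {
    "cas1 (potential frameshift)": "complement(1663207..1663759)",
    "adhD": "132644..134389",
}

# HYPOTHETICAL placeholder: replace with the range actually printed by CRT.
predicted_crispr = (133000, 133400)

print(overlapping_features(features, predicted_crispr))
```

An overlap found this way only says that the coordinates coincide; it does not settle whether the annotation or the prediction is the one at fault.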
Note that the GenBank annotation is as follows:
CDS 176037..178448
/locus_tag="SMU_180"
/old_locus_tag="SMU.180"
/note="Best Blastp Hit: pdb|1D4C|A Chain A, Crystal
Structure Of The Uncomplexed Form Of The Flavocytochrome C
Fumarate Reductase Of Shewanella Putrefaciens Strain Mr-1
>gi|6573310|pdb|1D4C|D Chain D, Crystal Structure Of The
Uncomplexed Form Of The Flavocytochrome C Fumarate
Reductase Of Shewanella Putrefaciens Strain Mr-1
>gi|6573308|pdb|1D4C|B Chain B, Crystal Structure Of The
Uncomplexed Form Of The Flavocytochrome C Fumarate
Reductase Of Shewanella Putrefaciens Strain Mr-1
>gi|6573309|pdb|1D4C|C Chain C, Crystal Structure Of The
Uncomplexed Form Of The Flavocytochrome C Fumarate
Reductase Of Shewanella Putrefaciens Strain Mr-1"
/codon_start=1
/transl_table=11
/product="oxidoreductase"
/protein_id="NP_720649.1"
/db_xref="GI:24378694"
/db_xref="GeneID:1029753"
Does this mean that M102 also has the same gene for oxidoreductase?
No, probably not, since the aligned region is too small. Again, an investment
of time is needed to better understand the biological reasons behind this
almost identical segment found in both the virus and the bacterium.
Solution:
Please answer the following.
Solution:
pValue = N[1/Sqrt[2 Pi] Integrate[ Exp[-x^2/2], {x, 2, Infinity}]]
to get the answer 0.0227501.
pValue = N[1/Sqrt[2 Pi] Integrate[ Exp[-x^2/2], {x, 6, Infinity}]]
which equals 9.86588 × 10^-10.
pValue = N[Integrate[PDF[ExtremeValueDistribution[2.5, 1.05], x], {x, 6, Infinity}]]
which equals 0.0350452.
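These tail probabilities can be cross-checked without Mathematica. The sketch below uses only Python's standard library. It assumes that `ExtremeValueDistribution[2.5, 1.05]` denotes the Gumbel (maximum) distribution with location 2.5 and scale 1.05, whose CDF is exp(-exp(-(x - location)/scale)). The function names are illustrative.

```python
import math

def normal_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def gumbel_tail(x, location, scale):
    """P(X > x) for a Gumbel (maximum) distribution with
    CDF(x) = exp(-exp(-(x - location) / scale)).
    expm1 keeps precision when the tail probability is small."""
    u = math.exp(-(x - location) / scale)
    return -math.expm1(-u)

print(normal_tail(2))               # compare with 0.0227501
print(normal_tail(6))               # compare with 9.86588e-10
print(gumbel_tail(6, 2.5, 1.05))    # compare with 0.0350452
```

The same score of 6 is thus far less surprising under the heavier-tailed extreme value distribution than under the normal distribution.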
Solution:
s is: CGAA
t is: GGACC

           G  G  A  C  C
        0  0  0  0  0  0
     C  0  0  0  0  1  1
     G  0  1  1  0  0  0
     A  0  0  0  2  1  0
     A  0  0  0  1  0  0

where again, the arrows aren't listed since it's difficult to indicate by typing; however, when we go over this in class, I will indicate the arrows, and in any case your solution SHOULD have the arrows.
    - G A -
    - G A - -
where the dashes are NOT part of the local alignment; they are shown only so you can see that the first sequence was CGAA of length 4 and the second sequence was GGACC of length 5.
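To check a hand-filled matrix like this one, a small Smith-Waterman implementation is handy. The sketch below is in Python and makes an assumption: the scoring scheme is not stated in this solution, so it uses match +1, mismatch -2 and linear gap penalty -1, which is one choice consistent with the matrix for CGAA and GGACC. The traceback loop plays the role of the arrows, following each cell back to the neighbor that produced its score.

```python
def smith_waterman(s, t, match=1, mismatch=-2, gap=-1):
    """Local alignment of s (rows) against t (columns).
    Scores are ASSUMED values, not taken from the problem statement.
    Returns (matrix, best_score, aligned_s, aligned_t)."""
    rows, cols = len(s) + 1, len(t) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,   # diagonal arrow
                          H[i - 1][j] + gap,       # up arrow
                          H[i][j - 1] + gap)       # left arrow
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a 0 is reached.
    i, j = best_pos
    a, b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if s[i - 1] == t[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            a.append(s[i - 1]); b.append(t[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            a.append(s[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(t[j - 1]); j -= 1
    return H, best, "".join(reversed(a)), "".join(reversed(b))

H, score, aligned_s, aligned_t = smith_waterman("CGAA", "GGACC")
for row in H:
    print(" ".join(str(v) for v in row))
print(score, aligned_s, aligned_t)
```

When several cells tie for the best score, this sketch keeps the first one found in row order; a full solution would report every optimal local alignment.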