BIOL4200
Introduction to Bioinformatics


Tentative Syllabus

  1. Aug 29, 31: Course mechanics, course overview, brief description of Sanger sequencing, shotgun sequencing, genome assembly, nucleic and amino acids, IUPAC uncertainty code, some basic notions from statistics.

  2. Sep 5, 7: No class on Sep 5 (Labor Day). Epidemiology, SIR model of COVID-19, functions and plots in Mathematica.

  3. Sep 12, 14: Chapter 1: genetic variation, mathematical models for the mitochondrial Eve hypothesis, introduction to Python programming, Markov chains, binomial and hypergeometric probability distributions, using Mathematica, coalescence, linkage disequilibrium.

  4. Sep 19, 21: Chapter 2: tag SNP selection problem (single nucleotide polymorphism = SNP), how to determine a minimum (or if not, a minimal) number of SNPs that suffice to distinguish different haplotype patterns in the population, introduction to greedy algorithms and integer linear programming.

  5. Sep 26, 28: Chapter 3 -- NCBI, EBI, Rfam database, RNAfold server, simple notions of graph theory, Euler tours and paths, Hamiltonian tours and paths, fragment assembly using De Bruijn graphs.

  6. Oct 3, 5: Chapter 4 -- Dot plots, Dynamic Programming (DP) sequence alignment, genomic applications, CRISPR, HIV-1 genome annotation, overlapping reading frames.

  7. Oct 11, 12: WARNING: No classes (Fall Break) on Monday, Oct 10. For that reason, your Monday classes meet instead on Tuesday, Oct 11. Chapter 4, continued, including statistical significance of BLAST hits (normal distribution, p-value, Z-score, shuffled sequence, extreme value distribution), ClustalW multiple sequence alignment, phylogeny created by Clustal, Dijkstra's shortest path algorithm.

  8. Oct 17, 19: Chapter 6 -- origin of replication, oriC, DnaA, using Ori-Finder and Ori-Finder 2, NCBI Protein Table (*.ptt) file, conversion from GenBank to PTT format (gbk2ptt: Convert.htm), cumulative distribution function (CDF), p-value, Z-score.

  9. Oct 24, 26: Chapter 7 -- Modeling regulatory motifs, profile, PSSM (position-specific scoring matrix), pseudocounts, weight matrices.

  10. Oct 31, Nov 2: Chapter 8 -- Shannon entropy, construction of sequence logos, analysis of hemagglutinin from influenza virus capsids. The MIDTERM on Nov 9 covers Chapters 1-4 and 7.

  11. Nov 7, 9: MIDTERM on Nov 9. Chapter 9 -- Chromosomal rearrangements.

  12. Nov 14, 16: Chapter 10 and class notes -- Phylogenetic tree construction, mitochondrial Eve hypothesis.

  13. Nov 21, 23: No classes Nov 23-25 (Thanksgiving Recess). However, class is mandatory on Nov 21 -- please attend.
    Chapter 11 -- Reconstructing the history of large-scale genomic changes.

  14. Nov 28, 30: RNA structure and function (class notes, since topic not covered in book).

  15. Dec 5, 7: PowerPoint presentations -- 8-minute presentations by 2-person teams on a topic in computational biology. Please be sure to discuss the suitability of your topic BEFORE you begin background research. Study days (no classes) on Dec 12, 13.

FINAL EXAM begins at 12:30 p.m. on Monday, December 19, 2022.