Aim of the Course
There has been an explosion of biological data in the past few years, including the complete genomes of many organisms from bacteria to humans, the structures of some RNA molecules and numerous proteins, and the expression profiles of thousands of genes obtained by chip technology. Converting this enormous volume of data into useful biological knowledge requires a multitude of computational and statistical tools, as well as novel conceptual perspectives. Progress in this task requires knowledge of a number of issues, such as optimization, partitioning, pattern recognition, and collective behavior, which are in the domain of statistical physics. Since the central task of statistical physics is to describe how complex behavior emerges from the interactions of large numbers of basic elements, its tools and concepts should be valuable in bioinformatics. The aim of this course is to introduce and explore some topics at the interface of physics and biology.
Prerequisites
Statistical Mechanics (8.333).
Readings
The course does not follow a specific textbook. The following books were placed on reserve for the course at the main MIT Library and the Physics Reading Room:
Alberts, et al. Molecular Biology of the Cell. 3rd ed. It is a standard reference to modern biology.
Durbin, et al. Biological Sequence Analysis. It describes some of the standard computational methods used in bioinformatics.
Frank-Kamenetskii. Unraveling DNA. It presents several relevant topics in a way that should appeal to physical scientists.
Assignments
The homework assignments are an important part of this course, and the overall average homework score will count for 80% of the final grade. You may consult with classmates in "study groups," as long as you write out your own answers.
The complete schedule of assignments (there will be 8), with due dates, is available online. Hyperlinks to the actual problem sets and solutions will be created as the term progresses. Problem sets are due by 5:00 pm on the due date. They can be turned in at lectures or placed in the appropriate homework cubby.
No problem sets will be accepted after the solutions have been posted. Late problem sets (submitted before solutions are posted) may be accepted, with legitimate excuses, for a reduced grade at the discretion of the instructors.
A Final Project will count for 20% of the final grade, and should be planned in consultation with the lecturers by session 12.
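The weighting described above can be written out as a short calculation. The following Python sketch is only an illustration: the function name `weighted_score`, the 0-100 score scale, and the example scores are assumptions and are not part of the course materials. The syllabus gives no numeric cutoffs for letter grades, so the sketch stops at the combined numeric score.

```python
from statistics import mean

# Weights and problem-set count as stated in the syllabus.
HOMEWORK_WEIGHT = 0.80
PROJECT_WEIGHT = 0.20
NUM_PROBLEM_SETS = 8


def weighted_score(homework_scores, project_score):
    """Combine homework and project scores into one number.

    Assumes every score is on the same scale (for example 0-100);
    the scale itself is an assumption, not something the syllabus states.
    """
    if len(homework_scores) != NUM_PROBLEM_SETS:
        raise ValueError(f"expected {NUM_PROBLEM_SETS} homework scores")
    # The overall homework average counts for 80%, the project for 20%.
    return HOMEWORK_WEIGHT * mean(homework_scores) + PROJECT_WEIGHT * project_score


if __name__ == "__main__":
    example_homework = [90, 85, 100, 70, 95, 80, 88, 92]  # made-up scores
    print(weighted_score(example_homework, project_score=90))
```

Because the eight problem sets are averaged, each one contributes 10% of the final grade.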
Grading
Final grades will be determined from:
- Homework: 8 problem sets, each counting for 10%
- Final Project: 20%
Your final letter grade will reflect our best attempt to evaluate your performance in the course objectively:
A: Exceptionally good performance, demonstrating a superior understanding of the subject matter, a foundation of extensive knowledge, and a skillful use of concepts and/or materials.
B: Good performance, demonstrating capacity to use the appropriate concepts, a good understanding of the subject matter, and an ability to handle the problems and materials encountered in the subject.
C: Adequate performance, demonstrating an adequate understanding of the subject matter, an ability to handle relatively simple problems, and adequate preparation for moving on to more advanced work in the field.
D: Minimally acceptable performance, demonstrating at least partial familiarity with the subject matter and some capacity to deal with relatively simple problems, but also demonstrating deficiencies serious enough to make it inadvisable to proceed further in the field without additional work.
F: Failed. This grade also signifies that the student must repeat the subject to receive credit.
Outline
- Introduction to Course (Material and Organization)
- Molecular Evolution
- Probability Theory
- Gene Annotation and Similarity Detection
- Sequence Alignment and Statistical Physics
- Substitution Matrices
- Statistical Physics of Polymers
- DNA
- RNA
- Proteins
- Random Energy Model
- Charged Polymers
- Protein Design
- Structural Elements
- Protein-Nucleic Acid Composites
- Hemoglobin
- Microtubules and Motors, and Motor-tubule Patterns
- Molecular Motors, and Stochastic Dynamics
- Cell Motion, and Modeling Networks
- Oxygen Binding in Hemoglobin
- Introduction to Networks, and Micro-array Technology
- Network Dynamics
- Fixed Points and Oscillations
- Biological Patterns