Excellent overview of methods used in bioinformatics. The first class covered the data sources in bioinformatics, which were broken down very systematically into DNA, RNA, proteins and metabolites. DNA is studied for one of the following: 1. Sequence variation, e.g. single nucleotide polymorphisms (SNPs) 2. Epigenetic modification, e.g. methylation and deacetylation 3. Structural variation, e.g. translocation and copy number variation.
RNA is studied through gene expression profiling to identify microRNAs that are associated with the disease.
Proteins are studied using mass spectrometry to measure protein expression. The main problem with protein expression studies is the size of the protein molecule, which is a deterrent to high-throughput studies using array technology.
The other data source mentioned was metabolites, which are by-products of a disease process and can be easily detected. Assays that detect them are therefore more robust, as people with the disease will most likely have the metabolite.
The second lecture covered the methods used for analysis: 1. Biomarker test 2. Data mining 3. Statistics 4. Sequence alignment
Biomarker testing includes identification of the marker --> sequencing the marker --> identifying the mRNA expression --> identifying the protein expression level --> validation of the biomarker by screening for it in clinical tumors.
Data mining techniques include 1. Unsupervised algorithms, e.g. k-means clustering 2. Supervised algorithms: regression and classification (random forests, decision trees, neural networks)
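To make the unsupervised case concrete, here is a minimal sketch of k-means clustering in plain Python. It was not part of the lecture; the function name `kmeans`, the sample values and the choice of two clusters are all assumptions made for illustration, with each point standing in for the expression levels of two hypothetical genes in one sample.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster points into k groups with the basic assign/update loop."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical expression values (gene A, gene B) for six samples.
samples = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(samples, k=2)
print(labels)
```

The first three samples should end up in one cluster and the last three in the other. Which cluster gets label 0 depends on the random starting points.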
The statistics mainly used on genetic data are linear and logistic regression, t-tests and chi-squared tests.
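As an example of how one of these tests works, the sketch below computes the chi-squared statistic for a 2x2 table by hand. The counts are made up for illustration (carriers and non-carriers of a hypothetical variant among cases and controls), and the function name `chi_squared` is my own. It returns only the statistic, not a p-value.

```python
def chi_squared(table):
    """Return the chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are cases and controls,
# columns are variant carriers and non-carriers.
counts = [[10, 20], [20, 10]]
print(chi_squared(counts))
```

For a 2x2 table there is one degree of freedom, so a statistic above about 3.84 is significant at the 5% level.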
Sequence alignment uses dynamic programming, which was difficult to understand from the slide. I found this simple website, which explains it very well.
http://www.avatar.se/molbioinfo2001/dynprog/dynamic.html
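For anyone who prefers code to slides, here is a minimal sketch of global alignment by dynamic programming (the Needleman-Wunsch approach). It is not taken from the lecture or from the website above. The scoring values (match +1, mismatch -1, gap -1) and the function name `needleman_wunsch` are assumptions chosen to keep the example simple.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell depends on its diagonal, upper and left neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("ACGT", "AGT")
print(best)
print(top)
print(bottom)
```

The table is filled once and then read backwards, so the work grows with the product of the two sequence lengths instead of with the number of possible alignments.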
Posted by
Sheetal Shetty
Friday, October 23, 2009