There are mainly two types of learning: supervised and unsupervised. In supervised learning, labeled examples "guide" the computer toward the best result; in unsupervised learning, there is no such guidance. According to the lecture, there are four major topics in machine learning: classification, clustering, regression, and semi-supervised learning.

In classification, we want to classify things into categories (yes/no, good/bad, healthy/unhealthy, and so on). Several techniques can be used to train a computer to do this. We first build a model from training data; the model is then used to classify new data (test data). In the case of a surgical training simulator, if a surgeon performs a task, we would want to classify the performance as good or bad (let's not consider fuzzy answers). To evaluate the result, we look at previous similar cases and note key parameters such as how the task was done and how long it took. Based on the majority of those results, we can classify the new surgeon's performance. The results from previous cases form the "training set", and the performance we want to classify is the "test set". Many tools can be used for classification; a few mentioned in class are k-nearest neighbor, (artificial) neural networks, the Naive Bayes classifier, and SVMs. Naive Bayes classifiers are used for spam filtering (as in SpamAssassin). SVMs can handle nonlinear classification problems (like the XOR problem). Neural networks are widely used in computer vision to train models that recognize parts of images. Classification falls under the supervised learning category.
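As a rough illustration of this train-then-classify workflow, here is a minimal sketch using a k-nearest neighbor classifier from scikit-learn. The "surgical performance" features (task duration, instrument path length) and their distributions are hypothetical, made up purely to show how a model is fit on a training set and then applied to a test set and to a new case.

```python
# Sketch only: hypothetical features and data, not real surgical metrics.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Hypothetical observations: [task duration (s), instrument path length (cm)]
# with labels 1 = good performance, 0 = bad performance.
good = rng.normal(loc=[120, 80], scale=[15, 10], size=(30, 2))
bad = rng.normal(loc=[200, 140], scale=[25, 20], size=(30, 2))
X = np.vstack([good, bad])
y = np.array([1] * 30 + [0] * 30)

# Hold out part of the data as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit the model on the training set, then classify the test set.
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))

# Classify a new surgeon's trial (150 s, 100 cm of instrument travel).
label = clf.predict([[150, 100]])[0]
print("New trial classified as:", "good" if label == 1 else "bad")
```

The same train/test pattern applies if we swap in Naive Bayes, an SVM, or a neural network for the classifier.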
In unsupervised learning, clustering is one of the popular techniques. In clustering, we take observations and group them into subsets (clusters) such that the observations within a cluster are similar in some sense (Wikipedia). We didn't go into much detail during the lecture; it would be interesting to learn more about clustering and semi-supervised learning in the next class.
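To show the contrast with classification, here is a minimal clustering sketch using k-means from scikit-learn on made-up, unlabeled data. The key point is that no labels guide the algorithm; it groups observations purely by similarity (distance to the cluster centers).

```python
# Sketch only: synthetic, unlabeled two-dimensional data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Unlabeled observations drawn from two loosely separated groups.
X = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(25, 2)),
    rng.normal(loc=[3, 3], scale=0.5, size=(25, 2)),
])

# k-means partitions the observations into k clusters by similarity.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Cluster assignments:", kmeans.labels_)
print("Cluster centers:\n", kmeans.cluster_centers_)
```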
The next lecture, by Dr. Petiti, was on study design, which I found very informative. She talked about various design techniques and how information can be manipulated in an experiment. We went into detail on descriptive, observational, experimental, and quasi-experimental studies. The main thing I took from the lecture was that experiments are randomized; in fact, randomization is what makes an experiment an experiment. If an experiment is not randomized, we cannot trust its results. But randomized experiments are difficult to carry out for a few reasons. The examples given in the lecture slides were practical ("we can't randomize smoking") and ethical ("we can't randomize cocaine use"). The classic studies presented during the lecture were interesting: for each study design there was a classic study, and they really made things easier to understand.
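Just to make the randomization point concrete, here is a tiny sketch (with hypothetical subject names) of random assignment to treatment and control groups, the step that distinguishes a true experiment from an observational study.

```python
# Sketch only: hypothetical subjects, randomly assigned to two arms.
import random

subjects = [f"subject_{i}" for i in range(1, 11)]
random.seed(42)        # fixed seed so the example is reproducible
random.shuffle(subjects)

half = len(subjects) // 2
treatment, control = subjects[:half], subjects[half:]
print("Treatment group:", treatment)
print("Control group:  ", control)
```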
Posted by Prabal Khanal