Saturday, October 3, 2009

Week Six: Ontologies/Semantics & Machine Learning

Content:
Overall, as Annie mentions, our colleagues have already covered many of the topics that were presented this week by Dr. Fridsma and Mr. Ji, so I will just sum up some of the important points I took from the lectures.

Ontologies/Semantic Web-
Dr. Fridsma's presentation was informative and understandable, although it is hard to distinguish one lecture from another since he taught three of the four lectures this week. He spoke of the difference between semantics and syntax:
  • Semantics - meaning and understanding
  • Syntax - structure
He then explained that the exchange of information is syntactic interoperability, while the use of that information is semantic interoperability. He also defined an ontology for us as an engineering construct rather than an underlying truth. This is reasonable since, as Dr. Fridsma noted, if it were an underlying truth it could not be rebutted or tested. The lecture then went on to the three-letter acronyms such as OWL, OIL, and DAML-ONT. He also distinguished the different levels (versions) of OWL implementation:
  • OWL Full - fully expressive, but reasoning over it poses serious computational problems
  • OWL DL - the full power of description logics while remaining decidable
  • OWL Lite - a simpler subset that is easy to implement
The talk also included information on RDF Schema (RDFS) and its importance in helping to build vocabularies.
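As a rough illustration of my own (not from the lecture), the RDF model boils down to subject-predicate-object triples; the URIs and names below are made-up examples:

```python
# Sketch of RDF's triple model using plain Python tuples.
# All identifiers (ex:Aspirin, ex:treats, ...) are hypothetical.
triples = [
    ("ex:Aspirin", "rdf:type", "ex:Drug"),
    ("ex:Aspirin", "ex:treats", "ex:Headache"),
    ("ex:Drug", "rdfs:subClassOf", "ex:Substance"),
]

def objects(subject, predicate):
    """Return all objects for a given subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects("ex:Aspirin", "ex:treats"))  # ['ex:Headache']
```

Real RDF tooling stores and queries triples the same way, just with full URIs and a query language (SPARQL) instead of a list comprehension.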

Machine Learning: supervised, unsupervised, and semi-supervised-
Mr. Ji covered machine learning but focused primarily on k-nearest neighbor and k-means, since these are the simplest of the methods and were appropriate as an introduction to the topic in the short class period.
  • Supervised Learning: This uses k-nearest neighbor to classify a new data point by comparing it to an existing labeled training set. The value of k is very important: if k equals the size of the training set, then the majority label of the whole training set is always returned; if k is too small (e.g., 1), the result may be unreliable since it is based on only a tiny subset of the data.
  • Unsupervised Learning: This involves grouping similar objects into the same group and different objects into different groups. We then covered k-means as a method used for unsupervised machine learning. Typically this method results in groupings where intra-cluster distances are minimized and inter-cluster distances are maximized.
  • Semi-supervised Learning: This is often presented as an ideal middle ground between the two, but as noted in lecture, despite looking like it should work ideally, in practice it often does not.
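The k-nearest-neighbor vote described above can be sketched in a few lines (my own illustration, with made-up training points, not Mr. Ji's code):

```python
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among the k nearest training
    points. `train` is a list of ((x, y), label) pairs; distance is
    plain Euclidean distance in 2-D."""
    dist = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: two classes in opposite corners.
train = [((1, 1), "A"), ((1, 2), "A"), ((8, 8), "B"), ((9, 8), "B")]
print(knn_predict(train, (2, 1), 3))  # "A"
```

With k = 4 (all data) this example would always return whichever label holds the majority, which is exactly the degenerate case mentioned above.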
In summary, supervised learning is used when we want to predict the label of a data point, while unsupervised learning is used when we want to group data points without applying labels.

Other Stuff
I found an interesting article in the news about how students' use of social networks is getting them into trouble with their schools, in some cases resulting in disciplinary action up to expulsion from medical school. This could possibly apply to the other health professions as well.

http://news.bbc.co.uk/2/hi/health/8266546.stm

As well, a slightly off-topic article on who is responsible for private information, stemming from a bank employee who sent customer information to the wrong person, including information about hundreds of other customers. As one of our lecturers stated, the financial system is very similar to the healthcare system: financial data versus PHI.

http://blogs.techrepublic.com.com/itdojo/?p=1031&tag=nl.e099.dl090930&tag=nl.e099

Posted by Eric
