Friday, October 2, 2009

Lectures this week

Content:
The lecture by Dr. Fridsma gave a good understanding of the differences between the semantic web and the syntactic web, the importance of the semantic web, and the applicability of ontologies in its development.

The Semantic Web is an evolving extension of the World Wide Web in which the meaning (semantics) of information and services on the web is explicitly defined, so that software can interpret it. The semantic web comprises a set of design principles, collaborative working groups, and a variety of enabling technologies. Some elements of the semantic web include the Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, N3, Turtle, N-Triples), and notations such as RDF Schema (RDFS) and the Web Ontology Language (OWL). These elements provide a formal description of concepts, terms, and relationships within a given knowledge domain.
The Syntactic Web, on the other hand, is a phrase that describes the current, mostly HTML-based World Wide Web. In a syntactic web, a computer program that extracts data from a site depends on the page markup, so if the creators of the site ever change the layout or HTML, the program would most likely need to be rewritten in some way. In contrast, if the data is presented semantically, the program can retrieve that semantic data directly, and the site's creators can change the look and feel of the site without affecting that retrieval ability.
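To make the contrast concrete, here is a minimal sketch in Python. It is not from the lecture: the two HTML snippets, the `ex:` identifiers, and the helper names `scrape_price` and `lookup` are all made up for illustration, and the triples only imitate the subject-predicate-object shape of RDF rather than using a real RDF library.

```python
import re

# Two hypothetical renderings of the same page; only the markup differs.
html_v1 = '<div class="price">Price: <b>12.50</b></div>'
html_v2 = '<span id="cost">12.50 USD</span>'


def scrape_price(html):
    """Syntactic approach: tied to the exact markup of layout v1."""
    match = re.search(r'<div class="price">Price: <b>([\d.]+)</b></div>', html)
    return float(match.group(1)) if match else None


# Semantic approach: the same fact published as (subject, predicate, object)
# triples, in the spirit of RDF. The identifiers are invented for this example.
triples = [
    ("ex:book42", "ex:title", "Clustering Basics"),
    ("ex:book42", "ex:price", "12.50"),
]


def lookup(triples, subject, predicate):
    """Return the first object matching subject and predicate, or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


print(scrape_price(html_v1))  # 12.5, the scraper matches the old layout
print(scrape_price(html_v2))  # None, the scraper breaks after a redesign
print(float(lookup(triples, "ex:book42", "ex:price")))  # 12.5, layout-independent
```

The scraper has to be rewritten whenever the markup changes, while the triple lookup keeps working because it never looks at the presentation at all.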
Shuiwang’s lecture mainly focused on various types of clustering. Clustering is the method by which similar records are grouped together. Usually this is done to give the end user a high-level view of what is going on in the database.
A simple example of clustering is the sorting that most people do with their laundry - grouping clothes into permanent press, dry cleaning, whites and brightly colored items. The grouping matters because the items in each group share important attributes in the way they behave (and can be ruined) in the wash. To “cluster” your laundry, most of your decisions are relatively straightforward. There are of course difficult decisions to be made, such as which cluster your white shirt with red stripes goes into (since it is mostly white but has some color and is permanent press). When clustering is used in business, the clusters are often much more dynamic - even changing weekly or monthly - and many more of the decisions about which cluster a record falls into can be difficult.
There are two main types of clustering techniques: those that create a hierarchy of clusters and those that do not. The hierarchical clustering techniques create a hierarchy of clusters from small to big. This hierarchy is a by-product of the algorithm that builds the clusters. There are two main types of hierarchical clustering algorithms:
• Agglomerative - Agglomerative clustering techniques start with as many clusters as there are records, where each cluster contains just one record. The clusters that are nearest to each other are merged to form the next larger cluster. This merging continues until a hierarchy of clusters is built, with a single cluster containing all the records at the top of the hierarchy.
• Divisive - Divisive clustering techniques take the opposite approach from agglomerative techniques. These techniques start with all the records in one cluster, split that cluster into smaller pieces, and then in turn try to split those smaller pieces.
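
The two approaches above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own, not the method shown in the lecture: the records are plain numbers, the distance between two clusters is taken as the smallest gap between any two members (single linkage), and the divisive step simply cuts a cluster at its widest gap. The function names are hypothetical.

```python
def single_link_distance(a, b):
    """Distance between two clusters: the smallest gap between any two members."""
    return min(abs(x - y) for x in a for y in b)


def agglomerative(points):
    """Merge the two nearest clusters repeatedly; return every level of the hierarchy."""
    clusters = [[p] for p in sorted(points)]  # start: one record per cluster
    levels = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        # Find the pair of clusters that are nearest to each other.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: single_link_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        clusters.sort()  # keep the output order stable
        levels.append([list(c) for c in clusters])
    return levels


def divisive_split(cluster):
    """One divisive step for 1-D data: cut the cluster at its widest gap."""
    ordered = sorted(cluster)
    if len(ordered) < 2:
        return [ordered]
    gaps = [ordered[k + 1] - ordered[k] for k in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return [ordered[:cut], ordered[cut:]]


records = [1, 2, 6, 7, 15]
for level in agglomerative(records):
    print(level)
# [[1], [2], [6], [7], [15]]
# [[1, 2], [6], [7], [15]]
# [[1, 2], [6, 7], [15]]
# [[1, 2, 6, 7], [15]]
# [[1, 2, 6, 7, 15]]

print(divisive_split(records))  # [[1, 2, 6, 7], [15]]
```

Reading the printed levels from top to bottom shows the agglomerative hierarchy being built from small to big, while the divisive step starts from the full set and produces the first split from the top down.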

Posted by
Harsha Undapalli
