Sunday, February 14, 2010

How it all began ....


First of all, I must mention that this is a blog that should have been started way back, kind of like July last year. But they say it's better late than never, so here we go with the first post :)


I am a final-year undergraduate at the University of Moratuwa in the Department of Computer Science and Engineering. This blog is about the final-year project that I am currently working on with my project team: Manisha, Sajith and Rohan.

The project idea is to simulate the natural process of human understanding on reading a document. That is, once some text is provided, the framework should be able to use its own knowledge to identify the places, people, animals and other objects that are mentioned in that text. Basically, we want to build a framework that can be used in custom applications to extract semantic data entities from document repositories of diverse formats.

To understand what I mean by the natural process of human understanding on reading, I will give you a simple example. If you are given the word "Grenelle" on its own, 99.9% of the time you won't know what it is. But if you are given a sentence like "Able used to live in Grenelle", you would guess that Grenelle is a city, a town or a country. Either way, you would understand it as a place. On the other hand, if you are given a more descriptive piece of information, like "Able used to live in Grenelle and often used to visit the Eiffel Tower on Friday nights", then you would have an idea that Grenelle is a city or a town in France. This is the thinking ability of a human being. Once a sentence or a piece of text is given as input to the human brain, it processes the semantics of the whole sentence. It identifies new words using the other, known words in that input. The correctness of the identification depends on how much information the input statement contains.
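To make the Grenelle example a bit more concrete, here is a minimal Python sketch of the same reasoning. This is not XemanticA's actual code. The cue pattern, the KNOWN_LANDMARKS table and the guess_entity function are all made up for illustration, and a real framework would need far richer knowledge than one regular expression and a one-entry lookup table.

```python
import re

# Assumption: a "live(s/d) in X" phrase hints that X is a place.
PLACE_CUE = re.compile(r"\b(?:live|lives|lived|living) in ([A-Z]\w+)")

# Assumption: a tiny table of background knowledge the "reader" already has.
KNOWN_LANDMARKS = {"Eiffel Tower": "France"}


def guess_entity(text, word):
    """Return a rough guess about `word` based on the text around it."""
    guess = {"word": word, "type": "unknown", "country": None}

    # Step 1: a known cue phrase in front of the word suggests it is a place.
    for match in PLACE_CUE.finditer(text):
        if match.group(1) == word:
            guess["type"] = "place"

    # Step 2: a known landmark in the same text narrows down the country.
    if guess["type"] == "place":
        for landmark, country in KNOWN_LANDMARKS.items():
            if landmark in text:
                guess["country"] = country

    return guess


# The three inputs from the example, from least to most informative.
print(guess_entity("Grenelle", "Grenelle"))
print(guess_entity("Able used to live in Grenelle", "Grenelle"))
print(guess_entity(
    "Able used to live in Grenelle and often used to visit "
    "the Eiffel Tower on Friday nights.",
    "Grenelle",
))
```

Just as in the example, the guess gets better as the input gets richer: the bare word stays unknown, the first sentence makes it a place, and the second sentence places it in France.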

Thus our project, XemanticA (eXtracted sEMANTIC data Analyser), is now on its way to achieving this goal.

