I completed my Master’s in Computer Science at the University of Colorado at Boulder in April 2011. The topic of my thesis is the extraction of a meaningful ontology and the production of an alignment to the extracted ontology. My Master’s thesis can be found here. I really enjoyed, and continue to enjoy, working in the areas of the Semantic Web, natural language processing, and alignment. The abstract of my thesis is:
“Many legacy relational databases are hidden behind business layers containing semantic information describing the data contained within the tables of the database. With the creation of the Semantic Web some databases have been exposed utilizing this technology, but with a cost. The process of exposing the database to the Semantic Web has not taken off because the manual mapping of the database to the ontology is improbable at a large scale, it is a time intensive process, and to create a domain ontology requires an Ontologist and/or domain expert. Many applications and approaches have been presented over the years to help expose these legacy databases to the Semantic Web. None of these solutions has become widely accepted because they translate all the data to Resource Description Framework (RDF).
This does not work with legacy databases since other systems are still interacting with that data. In addition, systems that translate the data from a legacy database to RDF triples do not scale for large databases because a statement or RDF triple is made for every cell within every table. Thus, the amount of information generated from a legacy system that has terabytes of data grows too large to be stored in a triple store. Other systems generate an ontology that is a basic representation of the schema and lacks any type of hierarchy or semantic meaning. This thesis proposes an architecture that will semi-automatically extract a meaningful ontology in a timely manner that can scale to handle large databases and expose the database as a virtual RDF graph by mapping the extracted domain ontology to the database. This will be accomplished by utilizing mapping rules that will evaluate the schema along with the data within the database and utilize existing knowledge bases, like DBpedia, in order to find similar ontology classes that match the structure and data within the database. This hybrid approach to ontology extraction and generation of a mapping between the database and extracted ontology does not require an Ontologist, manual mapping, or time intensive work to be done. In addition, the approach can be applied at a larger scale.”
Check it out. I hope you enjoy the information presented in it.
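To give a feel for the scaling problem the abstract mentions, here is a minimal Python sketch of a naive translation that emits one RDF-style triple for every cell of a table. It is not code from the thesis: the function name `rows_to_triples`, the `http://example.org/` base URI, and the sample employee table are all assumptions made up for illustration.

```python
from typing import Dict, Iterator, List, Tuple

# A triple is modelled as plain strings: (subject, predicate, object).
Triple = Tuple[str, str, str]


def rows_to_triples(
    table: str,
    key: str,
    rows: List[Dict[str, object]],
    base: str = "http://example.org/",  # hypothetical base URI
) -> Iterator[Triple]:
    """Naively emit one triple per cell of every row."""
    for row in rows:
        # The row's primary key value identifies the subject.
        subject = f"{base}{table}/{row[key]}"
        for column, value in row.items():
            # Each column becomes a predicate; each cell becomes an object.
            predicate = f"{base}{table}#{column}"
            yield (subject, predicate, str(value))


# Hypothetical sample data: 2 rows with 3 columns each.
employees = [
    {"id": 1, "name": "Ada", "dept": "R&D"},
    {"id": 2, "name": "Grace", "dept": "Ops"},
]

triples = list(rows_to_triples("employee", "id", employees))

# The triple count equals rows times columns.
print(len(triples))
```

Because the number of triples is the number of rows multiplied by the number of columns, summed over every table, a database holding terabytes of data produces far more statements than its row count alone suggests. That growth is the motivation for exposing the database as a virtual RDF graph instead of materializing every triple.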