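Propuesta de modelado de una ontología de dominio para la representación de acciones en política-economía [Proposal for modelling a domain ontology for the representation of actions in politics and economics]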
- Rodrigo Martínez Béjar Director
- Juan Antonio Pastor Sánchez Director
Defence university: Universidad de Murcia
Defence date: 01 December 2017
- José Antonio Moreiro González Chair
- Francisco Javier Martínez Méndez Secretary
- Ana Alice Rodrigues Pereira Baptista Committee member
Type: Thesis
ABSTRACT (ENGLISH)
INTRODUCTION: A huge amount of political-economic information is disseminated through digital social media in the form of news items. However, this content, like information in general, lacks a homogeneous structure and is published in large volumes, which makes it difficult to extract formal knowledge. Knowledge organisation and representation tools have been created to address these problems. The Semantic Web, and ontologies in particular, can improve the representation of the content of weakly structured documents by adding new elements to the tools traditionally used. However, ontology modelling is complex and requires natural language processing techniques such as automatic indexing. A purpose-designed methodology can therefore supply the vocabulary with elements capable of describing content.
AIMS AND HYPOTHESIS: The general aim of this thesis is to model an ontology that contributes to the representation of actions in the political-economic domain and facilitates the understanding of real-world facts in this context. The hypothesis is that digital press news items, used as a source for knowledge acquisition, together with the application of human indexing and semi-automatic indexing techniques for term extraction, are adequate for modelling the proposed ontology.
METHODOLOGY: To this end, a methodology comprising the following steps has been defined: the construction of a sample and subsample of news items from the political-economic domain, based on a selection of international digital newspapers; the analysis of Case Grammar and its application to the modelling of a general ontology for the description of actions (ONA); the application of human and semi-automatic indexing to subsample1 in order to modify ONA and to model a domain ontology for the description of political-economic actions (ONAPE); and an initial mapping to some vocabularies in order to identify equivalent elements and define classes and properties. ONAPE (and ONA) are evaluated by instantiating ONAPE with keywords (extracted with MetadadosHTML from the source code of subsample2) and by semantically annotating a portion of that subsample. Finally, the fit of ONAPE (and thus of ONA) to the studied domain is analysed by applying the accuracy and recall equations. In all cases, values above 0.9 are obtained. This ensures the correctness and specification of the elements of the modelled ontology.
CONCLUSIONS: The designed methodology has proved effective for the purpose of this thesis. With Case Grammar as its main theoretical component, the methodology can be used to model other domain ontologies. However, for these tools to remain useful, they need to be updated by analysing new corpora. As regards MetadadosHTML, there is evidence of the difficulty of information interchange: numerous metadata schemas exist for describing news items, but none of them has become a standard. Finally, future work is proposed, such as the use of ONA and ONAPE in projects focused on machine learning software for the automatic description of documents, and the specialisation of ONAPE in specific subdomains.