CORPUS LINGUISTICS — 2006

E. Malaia

This work focuses on ontological efforts to support information security applications, specifically the engineering of natural language processing technology, in the domain of Digital Identity Management (DIM). The paper addresses the methodology and practice of domain acquisition for two of the static knowledge sources: the ontology and the lexicon. We propose a domain-specific topic-source variability matrix, which can serve as a source of external validity for the ontological description of a "storming" domain. Working from the corpus, we adopted a two-pronged approach to lexical and ontological domain acquisition: concept-based initial acquisition followed by corpus-based acquisition. This process enables acquirers to ensure the external validity and internal consistency of the ontology and the lexicon, and helps the lexicon of a particular domain reach saturation faster. While the topic-source subdivision is necessarily domain-specific, the two-pronged methodology is applicable to ontological and lexical acquisition in any domain.
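As an illustration only, a topic-source matrix of the kind mentioned above might be tabulated as follows. This is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names, the document format, and the topic and source labels are all hypothetical. It counts corpus documents per (topic, source) cell and lists the cells with no coverage, which is one plausible way such a matrix could be used to check a corpus against the domain.

```python
from collections import Counter


def topic_source_matrix(documents, topics, sources):
    """Count corpus documents in each (topic, source) cell.

    `documents` is assumed to be a list of dicts with "topic" and
    "source" keys; this format is hypothetical.
    """
    counts = Counter((d["topic"], d["source"]) for d in documents)
    return {t: {s: counts[(t, s)] for s in sources} for t in topics}


def empty_cells(matrix):
    """Return the (topic, source) pairs that have no documents."""
    return [
        (topic, source)
        for topic, row in matrix.items()
        for source, n in row.items()
        if n == 0
    ]


# Hypothetical labels and documents, chosen only for illustration.
topics = ["authentication", "identity theft"]
sources = ["news", "vendor documentation"]
documents = [
    {"topic": "authentication", "source": "news"},
    {"topic": "authentication", "source": "vendor documentation"},
    {"topic": "identity theft", "source": "news"},
]

matrix = topic_source_matrix(documents, topics, sources)
gaps = empty_cells(matrix)  # cells where the corpus has no coverage
```

In this sketch, an empty cell would signal a topic-source combination for which further texts could be collected before corpus-based acquisition proceeds.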