Terminology 2

Principles of Data Categories (9/10)

A well-structured termbase follows three principles regarding its data categories: elementarity, granularity and interdependence.

Elementarity

According to the first principle, elementarity of data categories, each data category should hold exactly one data element. A structured terminology database may contain any number of data categories, but it is fundamental that each of them contains only one piece of information. Elementarity is a basic principle that increases the functionality of termbases and avoids problems when exchanging terminology (e.g. an entry is no longer clear if a context example is combined with a definition in a single data category).

For example, in the following termbase entry, the field German holds two different terms, Hinweis and Hinweistext:

German: Hinweis (e) / Hinweistext

English: note(s)

Source: SAGA

Hinweis and Hinweistext should be treated as synonyms and stored in two separate fields. The plural forms in the fields German and English are unnecessary. Irregular plural forms should instead be documented in a context example or, for instance, in a Remark field.
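
The clean-up described above can be sketched in a few lines of Python. This is only an illustration, not a feature of any particular termbase tool: the function name split_terms is hypothetical, and the sketch assumes that "/" separates alternative terms and that a bracketed suffix such as "(e)" or "(s)" marks a plural form. Real data may use slashes or brackets for other purposes.

```python
import re

# Assumption: a bracketed suffix such as "(e)" or "(s)" is a plural marker.
PLURAL_MARKER = re.compile(r"\s*\([^)]*\)")


def split_terms(raw_value):
    """Split a combined field value into single terms (hypothetical helper)."""
    terms = []
    # Assumption: "/" separates alternative terms stored in one field.
    for part in raw_value.split("/"):
        cleaned = PLURAL_MARKER.sub("", part).strip()
        if cleaned:
            terms.append(cleaned)
    return terms


# The entry from the example, with two terms sharing one field.
combined = {
    "German": "Hinweis (e) / Hinweistext",
    "English": "note(s)",
    "Source": "SAGA",
}

# Each term now occupies its own slot; the plural markers are gone.
elementary = {
    "German": split_terms(combined["German"]),
    "English": split_terms(combined["English"]),
    "Source": combined["Source"],
}

print(elementary)
# {'German': ['Hinweis', 'Hinweistext'], 'English': ['note'], 'Source': 'SAGA'}
```

Each element of the resulting lists would then be stored in its own term field, with Hinweistext recorded as a synonym of Hinweis.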

Granularity

The second principle, granularity of data categories, requires that data categories be defined as precisely as possible.

The following term database entry lacks granularity (see the field Grammar, which mixes information on the gender and the number of the term):

German: Gewinnabführungsverträge

Grammar: m, pl

Separate fields should be defined for the different types of grammatical information; otherwise, filtering by these categories, for example, is not possible:

German: Gewinnabführungsverträge

Gender: m

Number: pl
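
To illustrate why the finer-grained fields matter, the following Python sketch splits a combined grammar value into separate fields for grammatical gender and number and then filters on one of them. The names TermEntry and split_grammar, the abbreviation sets and the second entry (Hinweis) are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: these abbreviations are the only values used in the Grammar field.
GENDERS = {"m", "f", "n"}
NUMBERS = {"sg", "pl"}


@dataclass
class TermEntry:
    term: str
    gender: Optional[str] = None
    number: Optional[str] = None


def split_grammar(term, grammar):
    """Turn a combined value such as "m, pl" into separate fields."""
    entry = TermEntry(term)
    for token in (t.strip() for t in grammar.split(",")):
        if token in GENDERS:
            entry.gender = token
        elif token in NUMBERS:
            entry.number = token
        elif token:
            raise ValueError(f"unrecognised grammar token: {token!r}")
    return entry


entries = [
    split_grammar("Gewinnabführungsverträge", "m, pl"),
    split_grammar("Hinweis", "m, sg"),  # illustrative second entry
]

# With a separate number field, filtering becomes a simple comparison.
plurals = [e.term for e in entries if e.number == "pl"]
print(plurals)
```

With the single Grammar field, the same filter would have to search inside a free-text value, which is fragile as soon as the order or spelling of the abbreviations varies.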

ISO 12620 (1998) offers a series of data categories that are superordinate to other data categories. However, it is important to keep in mind that granularity is language-dependent: the category Inflection, for example, is relevant for English nouns and their plural forms, but for German nouns it has to be subdivided according to the different cases: nominative (also first or straight case), accusative, genitive (second or possessive case) and dative (third case) (cf. Trippel, 1999).

Interdependence

The last principle is about the interdependence of data categories. Dependency relations between data categories result from their contents. For example:

  • A Definition always requires a Source. The source need not be literature; it can also be a website or a person (e.g. an engineer, a native speaker, etc.).
  • For a second Definition, the Source field must be repeated.
  • The Source of the definition must be distinguished from the Source of the term or of the context, either by using a specific data category (e.g. Def_Source) or by defining hierarchical relations in the entry structure.

These interdependence relations should be considered when designing and defining the entry structure, modelling data categories and entering data.
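
As a rough sketch of how such dependencies could be modelled and checked, the Python example below nests the source under each definition (the hierarchical option mentioned above) and reports definitions that lack one. The class names, the field names, the function missing_sources and the placeholder definition texts are all hypothetical; actual termbase systems define their own entry structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Definition:
    text: str
    # The source belongs to this definition only (hierarchical relation).
    source: Optional[str] = None


@dataclass
class Entry:
    term: str
    # Kept separate from the definition sources on purpose.
    term_source: Optional[str] = None
    definitions: List[Definition] = field(default_factory=list)


def missing_sources(entry):
    """Return the positions of definitions that have no source."""
    return [
        index
        for index, definition in enumerate(entry.definitions)
        if not (definition.source and definition.source.strip())
    ]


entry = Entry(
    term="Hinweis",
    term_source="SAGA",
    definitions=[
        Definition("placeholder definition A", source="interview with an engineer"),
        Definition("placeholder definition B"),  # second definition, source not repeated
    ],
)

print(missing_sources(entry))
# [1]
```

Because every definition carries its own source, a second definition cannot silently inherit the source of the first, and the source of the term remains a distinct field.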

 
