An entry in the LEXICON database is illustrated below. Each entry starts with "nr.", a unique identifier of the token at issue. This identifier is identical to the names of the corresponding transcription file (.xml) and sound file (.mp3) in the database, and consists of the following parts:

UUM = Urum, ISO 639-3;

-LEX = LEXICON collection;

-01 = identifier of the semantic field (1 to 24);

-10000 = identifier of the concept (following the WOLD standard);

-01 = elicitation session (1 to 4).
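
The identifier scheme above can be checked mechanically. The following Python sketch is not part of the database tooling; the function name `parse_lexicon_id`, the `LexiconId` type, and the fixed digit widths (2, 5, and 2 digits, inferred from the single example "UUM-LEX-01-10000-01") are assumptions made for illustration.

```python
import re
from typing import NamedTuple


class LexiconId(NamedTuple):
    """Parts of a LEXICON identifier (hypothetical helper type)."""
    language: str        # ISO 639-3 code, e.g. "UUM"
    collection: str      # collection label, e.g. "LEX"
    semantic_field: int  # 1 to 24
    concept: str         # concept identifier, kept as text
    session: int         # elicitation session, 1 to 4


# Assumption: digit widths are inferred from the documented example.
ID_PATTERN = re.compile(r"^([A-Z]{3})-([A-Z]{3})-(\d{2})-(\d{5})-(\d{2})$")


def parse_lexicon_id(identifier: str) -> LexiconId:
    """Split a LEXICON identifier into its parts and check the documented ranges."""
    match = ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a well-formed identifier: {identifier!r}")
    language, collection, field, concept, session = match.groups()
    if not 1 <= int(field) <= 24:
        raise ValueError(f"semantic field out of range (1 to 24): {field}")
    if not 1 <= int(session) <= 4:
        raise ValueError(f"elicitation session out of range (1 to 4): {session}")
    return LexiconId(language, collection, int(field), concept, int(session))


parsed = parse_lexicon_id("UUM-LEX-01-10000-01")
print(parsed.semantic_field, parsed.concept, parsed.session)  # 1 10000 1
```

The range checks apply only to the LEXICON collection as described here; identifiers from other collections may follow different conventions.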

The next layers contain the target concept, in English ("wrde."), in Greek ("wrdg."), and in Russian ("wrdr."). The example layers present the sentential frame that we used for the elicitation of the target concept, in English ("exme.") and in Russian, which was the contact language ("exmr."). The next layers contain the obtained data: a native transcription is given in "orth." (following the rules in the Transcription Table), and a phonetic transcription is given in "phon." (only available for 207 words, see Illustrative Phonetic Transcriptions). The layer "comm." contains occasional field notes and the layers "sound." and "meta." contain references to the file names of the sound file and the metadata file respectively. Finally, the layer "auth." contains information about the contributors of the entry at issue.

\nr. UUM-LEX-01-10000-01

\wrde. the world

\wrdg. κόσμος

\wrdr. Мир

\exme. The world is beautiful.

\exmr. Мир красивый.

\orth. dyunya

\phon.

\comm.

\sound. UUM-LEX-01-10000-01.wav

\meta. UUM-MET-00-00001-00.wav

\auth. collection/native transcription: V. Moisidi
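
For readers who want to process entries automatically, the layer format shown above (a backslash, a layer label ending in a period, then the value) can be read with a few lines of Python. This is a minimal sketch under the assumption that every layer occupies one line, as in the example; the function name `parse_entry` is hypothetical and not part of the database.

```python
import re
from typing import Dict

# One layer per line: backslash, label, period, optional value.
FIELD_PATTERN = re.compile(r"^\\(\w+)\.\s*(.*)$")


def parse_entry(text: str) -> Dict[str, str]:
    """Map each layer label (without backslash and period) to its value."""
    entry: Dict[str, str] = {}
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line.strip())
        if match:  # blank lines and anything else are skipped
            label, value = match.groups()
            entry[label] = value.strip()
    return entry


# Excerpt of the entry illustrated above (raw string keeps the backslashes).
sample = r"""
\nr. UUM-LEX-01-10000-01
\wrde. the world
\wrdr. Мир
\orth. dyunya
\phon.
\comm.
"""

entry = parse_entry(sample)
print(entry["orth"])        # dyunya
print(repr(entry["phon"]))  # '' (layer present but empty)
```

Empty layers such as "phon." and "comm." are kept with an empty string, so a missing layer can be told apart from an empty one.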