Frequency Behavior of Color Adjectives in Russian Poetic Texts

The material was received by the Editorial Board: 20.09.2018
Abstract
Linguistic corpora and computer technologies make it possible to search and analyze large amounts of unstructured text. This paper describes in detail the method we used to extract color adjectives from poetic texts found in the Poetry corpus of the Russian National Corpus (RNC) and in various Internet sources. Using a base of 180 lexical units extracted from the poetic texts of 36 authors, we devised a categorization scheme for color adjectives; the scheme also incorporates data obtained from the Hermitage Museum information system. It comprises five categories, the largest of which (“Derivatives”) is broken down into three subcategories. The paper further provides quantitative data showing how well the elements of each category are represented in the texts, and from these data we draw a preliminary conclusion about the use of color adjectives by various authors. Specifically, we compared the frequency of the basic color adjectives (белый – white; чёрный – black; красный – red; зелёный – green; жёлтый – yellow; синий – blue; голубой – light blue; коричневый – brown; оранжевый – orange; розовый – pink; and фиолетовый – violet) in four corpora of the Russian National Corpus: the Basic corpus, Newspaper corpus, Oral corpus, and Poetry corpus. The paper describes some patterns of frequency behavior for color adjectives in Russian poetic texts. We arranged the adjectives retrieved from each corpus in order of decreasing frequency and found that, in all four corpora, the order largely agrees with the evolutionary theory of Berlin and Kay, which describes the order in which color terms appear in the historical development of different languages. The comparison also showed that color adjectives are significantly more frequent in the Poetry corpus than in the other three corpora.
In addition, we searched the information system of the State Hermitage Museum and found that the agreement between the frequency order of color adjectives and the Berlin and Kay evolutionary model is weaker there than in the RNC corpora. In the course of the study we also found a few semantic tagging errors in the Russian National Corpus. The patterns of frequency behavior of color adjectives revealed for Russian may provide a reliable basis for further research. The classification of these adjectives also needs further attention.
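
The comparison of a frequency ordering with the Berlin and Kay sequence can be illustrated with a rank correlation. The following is a minimal sketch, not the procedure used in the paper: the six-tier simplification of the Berlin and Kay stages, the placement of голубой in the same tier as синий, the choice of a Spearman-style coefficient, and the function names are all assumptions made for illustration.

```python
from statistics import mean

# Simplified Berlin and Kay tiers (an assumption of this sketch):
# 1 white/black, 2 red, 3 green/yellow, 4 blue, 5 brown, 6 orange/pink/violet.
# Placing "голубой" in the blue tier is also an assumption.
BERLIN_KAY_TIER = {
    "белый": 1, "чёрный": 1,
    "красный": 2,
    "зелёный": 3, "жёлтый": 3,
    "синий": 4, "голубой": 4,
    "коричневый": 5,
    "оранжевый": 6, "розовый": 6, "фиолетовый": 6,
}


def average_ranks(values):
    """Return 1-based ascending ranks; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def berlin_kay_agreement(freq):
    """Spearman-style correlation between frequency rank and tier rank.

    freq maps each adjective to its frequency in one corpus.
    A value near 1 means more frequent adjectives sit in earlier tiers.
    """
    words = list(freq)
    # Negate frequencies so that the most frequent adjective gets rank 1.
    freq_ranks = average_ranks([-freq[w] for w in words])
    tier_ranks = average_ranks([BERLIN_KAY_TIER[w] for w in words])
    return pearson(freq_ranks, tier_ranks)
```

Calling `berlin_kay_agreement` on a dictionary of per-corpus frequencies (for example, one dictionary per RNC corpus) gives a single number per corpus, which makes statements such as "weaker than in the RNC corpora" directly comparable. No corpus counts are reproduced here; real frequencies would have to be taken from the corpora themselves.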

Keywords
color adjectives, Berlin and Kay’s basic color terms theory, Russian National Corpus, poetic corpus, semantic tagging
References: Andrey Ts. Masevich, Victor P. Zakharov. Frequency Behavior of Color Adjectives in Russian Poetic Texts. NSU Vestnik Journal, Series: Linguistics and Intercultural Communication, vol. 17, no. 1, pp. 21–48. DOI: 10.25205/1818-7935-2019-17-1-21-48