Abstract: While traditional grammar mainly employs qualitative methods in linguistic description, quantitative linguistics relies on statistical methods. Quantitative linguistic analysis based on (syntactically tagged) corpora can not only test the established laws of grammatical studies but also yield more objective typological comparisons of different languages, and thus a more precise description and explanation of the commonality and diversity of human languages.
With the STR treebank built by the Russian Academy of Sciences as its data source, this paper extracts data on nominal structures with nouns as governors, statistically analyzes the basic types and word-order properties of these nominal structures and of the dependency relations that nouns can govern, and draws the following conclusions from the perspective of quantitative analysis: 1) Russian is a dependent-final language, with 64.41% of the dependents occurring after their governors; 2) the major types of nominal structures in Russian are concordant attribute and non-concordant attribute relations, accounting for 40.49% and 32.02% respectively; 3) word order in the Russian nominal structures tends to follow certain patterns, with A(w)+S(g) structures mostly (96.51%) dependent-initial and the other three types dependent-final. In addition, statistical analysis of the dependency distances of nominal structures shows that different structures vary in dependency distance, reflecting different degrees of proximity between the words. Analysis of the composition of high-frequency dependency relations shows that nouns are the word class with the most diversified syntactic functions.
On the basis of a quantitative investigation of Russian nouns, we have achieved three objectives: 1) the established statements of traditional grammar studies have been confirmed; 2) these statements have been formulated more precisely; 3) certain structural patterns of Russian nouns that traditional methods are incapable of detecting have now been identified. Therefore, quantitative studies of the syntactic properties of word classes not only complement the relevant statements of traditional grammar from a quantitative perspective but are also significant for linguistic typology.
As the data extracted from the STR treebank in this paper cover texts of different genres, the conclusions derived from the data are generalizable to the Russian language as a whole. In future research, data extracted from texts of different genres can be compared to achieve a more fine-grained characterization of the syntactic properties of Russian nouns, and comparison with other languages can shed light on the commonality and diversity of human languages.
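The two kinds of figures reported above, the share of dependents that follow their governors and the dependency distance of each relation, can be illustrated with a minimal sketch. It assumes that each dependency is available as a triple of governor position, dependent position, and relation label, and that dependency distance is measured as the absolute difference between the two word positions. The toy arcs, the relation labels, and the function names are hypothetical and are not taken from the STR treebank or from the procedure actually used in this paper.

```python
from collections import defaultdict

# Hypothetical toy data: (governor_position, dependent_position, relation_label).
# Positions are 1-based word indices within a sentence; the labels are
# illustrative only and are not the treebank's actual relation names.
ARCS = [
    (2, 1, "concordant-attribute"),      # dependent precedes the noun
    (2, 3, "non-concordant-attribute"),  # dependent follows the noun
    (5, 4, "concordant-attribute"),
    (5, 7, "non-concordant-attribute"),
]


def dependent_final_share(arcs):
    """Return the fraction of arcs whose dependent follows its governor."""
    after = sum(1 for gov, dep, _ in arcs if dep > gov)
    return after / len(arcs)


def mean_dependency_distance(arcs):
    """Return the mean absolute distance |gov - dep| for each relation label."""
    totals = defaultdict(lambda: [0, 0])  # label -> [distance sum, arc count]
    for gov, dep, rel in arcs:
        totals[rel][0] += abs(gov - dep)
        totals[rel][1] += 1
    return {rel: total / count for rel, (total, count) in totals.items()}


if __name__ == "__main__":
    print(f"dependent-final share: {dependent_final_share(ARCS):.2%}")
    for rel, dist in mean_dependency_distance(ARCS).items():
        print(f"{rel}: mean dependency distance {dist:.2f}")
```

Applied to arcs extracted from a treebank rather than to this toy list, the first function would give a percentage of the kind quoted for dependent-final order, and the second would give the per-relation distances used to compare how closely different structures bind their words.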