Discovery of Kolmogorov Scaling in the Natural Language
Figure 1. (a) Histogram of the 90,094 concordances $C$ in the data-base, binned by size $N$ (total word count); (b) fractions $p/N$ and $q/N$ of the number of distinct words and of the number of distinct words in a truncated dictionary $D$ of the 10,000 most common words, respectively.
Figure 2. (a) Correlation of $N^{-1}\sigma^{2}$ with $R$ (all concordances, blue dots), showing scaling with $R$. The correlation is effectively linear over most of the range of $R$ (red curve), supporting Miller’s conjecture on the equivalence of variance with information. However, nonlinearity sets in when $R$ is large (black curve); (b,c) $Y = \sigma R^{-1/2}$ scales with $N^{1/2}$, with essentially Gaussian fluctuations.
Figure 3. Power-law behavior (Equation (19)) of the mean $R$ and variance $\sigma^{2}$ in a data set of concordances (Figure 1) as a function of $k = 1/N$.
Figure 4. Information rate efficiency $R/U$ as a function of concordance size $N$.
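The fractions $p/N$ and $q/N$ named in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the author's implementation: the tokenization rule, the case-insensitive comparison, the function name `distinct_word_fractions` and the toy dictionary are all assumptions made for illustration, and the real truncated dictionary $D$ holds the 10,000 most common words.

```python
import re


def distinct_word_fractions(text, dictionary):
    """Return (p/N, q/N) for a single concordance.

    N is the total word count, p the number of distinct words, and
    q the number of distinct words that also occur in `dictionary`.
    """
    # Assumption: a word is a run of letters or apostrophes,
    # compared case-insensitively.
    words = re.findall(r"[a-z']+", text.lower())
    n = len(words)
    if n == 0:
        raise ValueError("text contains no words")
    distinct = set(words)
    p = len(distinct)
    q = len(distinct & set(dictionary))
    return p / n, q / n


# Hypothetical stand-in for the truncated dictionary D.
toy_dictionary = {"apple", "pie", "my", "is", "to", "the"}
sample = "My homemade apple pie is like a siren call to my family"
p_frac, q_frac = distinct_word_fractions(sample, toy_dictionary)
print(f"p/N = {p_frac:.4f}, q/N = {q_frac:.4f}")
```

Applied to every concordance in a data set, the two returned values would give one point each for the two curves described in panel (b), plotted against $N$.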
Abstract
1. Introduction
2. A Data-Base of Concordances
3. Kolmogorov Scaling in Energy Flow
4. Power Law and Kolmogorov Scaling in Information Flow
5. Conclusions
Acknowledgments
Conflicts of Interest
References
- Cisco. The Zettabyte Era: Trends and Analysis, 2014. Available online: http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/VNI_Hyperconnectivity_WP.pdf (accessed on 27 April 2017).
- Cisco Visual Networking Index: Forecast and Methodology, 2015–2020. Available online: http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-indexvni/complete-white-paper-c11-481360.pdf (accessed on 27 April 2017).
- British National Corpus, Oxford Text Archive, University of Oxford. Available online: http://www.natcorp.ox.ac.uk/ (accessed on 27 April 2017).
- Kulig, A.; Drozdz, S.; Kwapien, J.; Oswiecimka, P. Modelling subtle growth of linguistic networks. Phys. Rev. E 2015, 91, 032810.
- Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423.
- Shannon, C.E. Communication in the presence of noise. Proc. IRE 1949, 37, 10–21.
- Wisbey, R. Concordance Making by Electronic Computer: Some Experiences with the “Wiener Genesis”. Mod. Lang. Rev. 1962, 57, 161–172.
- Miller, G.A. The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol. Rev. 1956, 63, 81–97.
- Mehri, A.; Lashkari, S.M. Power-law regularities in human language. Eur. Phys. J. B 2016, 89, 241.
- Jakobson, R.; Fant, C.G.M.; Halle, M. Preliminaries to Speech Analysis: Features and Their Correlates; MIT Press: Cambridge, MA, USA, 1961.
- Batchelor, G.K. The Theory of Homogeneous Turbulence; Cambridge University Press: Cambridge, UK, 1953.
- Kolmogorov, A.N. The local structure of turbulence in incompressible viscous fluid for very large Reynolds numbers. Proc. R. Soc. Lond. A 1991, 434, 9–11.
- Orszag, S.A. Analytical theories of turbulence. J. Fluid Mech. 1970, 41, 363.
- Van Putten, M.H.P.M. Method to Search Objectively for Maximal Information. U.S. Patent 20130191365A1, 25 July 2013.
- Van Putten, M.H.P.M. Available online: www.iTopSearch.com (accessed on 27 April 2017).
- Mathieu, J.; Scott, J. An Introduction to Turbulent Flow; Cambridge University Press: Cambridge, UK, 2000.
- Nieuwstadt, F.T.M.; Boersma, B.J.; Westerweel, J. Turbulence—Introduction to Theory and Applications of Turbulent Flows; Springer: New York, NY, USA, 2016.
- Van Putten, M.H.P.M.; Guidorzi, C.; Frontera, F. Broadband turbulent spectra in gamma-ray burst light curves. Astrophys. J. 2014, 786, 146.
- Statistics and Machine Learning Toolbox, MathWorks Inc. Available online: https://www.mathworks.com/stats/index.html (accessed on 27 April 2017).
- Van Putten, M.H.P.M. Bilingual Search Engine for Mobile Devices. U.S. Patent 20160004697A1, 7 January 2016.
Index | Word | Probability | Comment |
---|---|---|---|
386 | apple | 0.0256 | Upper and lower case |
6481 | perfect | 0.1525 | Upper and lower case |
6034 | no | 6.7556 | Upper and lower case |
9946 | yes | 2.7188 | Upper and lower case |
- | Woolsthorpe | 0 | Newton’s birthplace; not in Merriam-Webster |
Rank | Concordance | R | |
---|---|---|---|
1 | .. splash of brandy. My homemade apple pie is like a siren call to my family. All I have to do is pick up the phone and say “pie” to my father and he’s here in less time than it takes to clear a place at the table. You know when people .. | 1.1202 | 1.6275 |
2 | .. of pumpkin and apple together just make my heart happy. The photos are gorgeous and your lattice is freaking perfect! :) I have never been to an apple orchard either but I always envision it as a marvelous occasion. Maybe one day I will go! Pinning this pie for future reference ;) Reply .. | 1.0987 | 1.5144 |
... | ... | ... | ... |
10 | .. that I’ve baked apple pie, this recipes was easy to follow AND most importantly it came out delicious. Received lots of compliments on this so Thank You!!!! Curious to know if there are any supplements for the sugar though, trying to make a version for my parents who are trying to cut back .. | 1.0263 | 1.5126 |
© 2017 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Van Putten, M.H.P.M. Discovery of Kolmogorov Scaling in the Natural Language. Entropy 2017, 19, 198. https://doi.org/10.3390/e19050198