Clustering is a fundamental technique in machine learning and data mining, offering a powerful lens to understand self-organizing patterns in the real world. At its…
Language
The complexity of a document can be measured from various perspectives, such as compression rate and the degree of fluctuation. The complexity varies depending on the extent to which the…
We proposed a stock vector representation called “stock embedding,” obtained using a deep learning framework that utilizes news articles and stock price history. This embedding…
The Strahler number was originally proposed to characterize the complexity of river bifurcation and has found various applications. This article proposes computation of the Strahler…
We apply an information-theoretic perspective to reconsider generative document retrieval (GDR), in which a document x∈X is indexed by t∈T, and a neural autoregressive model is…
The correlation dimension of natural language is measured by applying the Grassberger-Procaccia algorithm to high-dimensional sequences produced by a large-scale language model. This method, previously…
We explore the complexity underlying human symbolic sequences via entropy rate estimation. Consider the number of possibilities for a time series of length n, with a…
The bitcoin price crash at the beginning of 2018 was caused by various social factors. The influence of news wire stories and social media was…
The theory of econophysics reveals the scaling properties of price, which explain why markets crash much more frequently than expected. A challenge of financial market…
Real instances of social systems have a bursty character, meaning that events occur in a clustered manner. For example, the figure below shows how rare…