Publications

An Excerpt from Reviewed Papers Authored by Lab Members

  • Xin Du, Lixin Xiu, and Kumiko Tanaka-Ishii. Bottleneck-Minimal Indexing for Generative Document Retrieval. ICML 2024 (Oral). link arxiv code
  • Xin Du and Kumiko Tanaka-Ishii. Correlation Dimension of Natural Language in a Statistical Manifold. Physical Review Research, 6, L022028, 2024. link
  • Xin Du, Kai Moriyama, and Kumiko Tanaka-Ishii. Co-Training Realized Volatility Prediction Model with Neural Distributional Transformation. 4th ACM International Conference on AI in Finance (ICAIF’23), 2023, Oral. link
  • Kumiko Tanaka-Ishii and Akira Tanaka. Strahler number of Natural Language Sentences in Comparison with Random Trees. Journal of Statistical Mechanics, 2023, 1234034. Selected by the JSTAT Scientific Directors for the Highlights collection 2024. link link
  • Xin Du and Kumiko Tanaka-Ishii. Semantic field of words represented as nonlinear functions. Advances in Neural Information Processing Systems (NeurIPS), 2022. link
  • Xin Du and Kumiko Tanaka-Ishii. Stock portfolio selection balancing variance and tail risk via stock vector representation acquired from price data and texts. Knowledge-Based Systems, 249, 2022. link
  • Kumiko Tanaka-Ishii and Shuntaro Takahashi. A comparison of two fluctuation analyses for natural language clustering phenomena: Taylor vs. Ebeling & Neiman methods. Fractals, 2, 2021. in press. link
  • Xin Du and Kumiko Tanaka-Ishii. Stock embeddings acquired from news articles and price history, and an application to portfolio optimization. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL) Long Papers, 2020. link
  • Shuntaro Takahashi and Kumiko Tanaka-Ishii. Evaluating computational language models with scaling properties of natural language. Computational Linguistics, 45(3):1–33, 2019. link
  • Geng Ren, Shuntaro Takahashi, and Kumiko Tanaka-Ishii. Entropy rate estimation for English via a large cognitive experiment using Mechanical Turk. Entropy, 21(12), December 2019. link
  • Shuntaro Takahashi, Yu Chen, and Kumiko Tanaka-Ishii. Modeling financial time-series with generative adversarial networks. Physica A, 527(121261), 2019. link
  • Kumiko Tanaka-Ishii and Tatsuru Kobayashi. Taylor’s law for linguistic sequences and random walk models. Journal of Physics Communications, 2018. link
  • Shuntaro Takahashi and Kumiko Tanaka-Ishii. Cross Entropy of Neural Language Models at Infinity—A New Bound of the Entropy Rate. Entropy, 2018. link
  • Tatsuru Kobayashi, Kumiko Tanaka-Ishii. Taylor’s Law of Human Linguistic Sequences. Proceedings of Annual Conference for Computational Linguistics (ACL), pages 1138–1148, 2018. link
  • Daiki Hirano, Kumiko Tanaka-Ishii, and Andrew Finch. Extraction of templates from phrases using Sequence Binary Decision Diagrams. Natural Language Engineering, Cambridge University Press, 24:1–33, 2018. link
  • Kumiko Tanaka-Ishii. Long-Range Correlation Underlying Childhood Language and Generative Models. Frontiers in Psychology, section Quantitative Psychology and Measurement, 2018. link
  • Shuntaro Takahashi, Kumiko Tanaka-Ishii. Do Neural Nets Learn Statistical Laws behind Natural Language? PLoS One, 2017. link
  • Ryosuke Takahira, Kumiko Tanaka-Ishii, and Lukasz Debowski. Entropy Rate Estimates for Natural Language—A New Extrapolation of Compressed Large-Scale Corpora—. Entropy, 2016. link
  • Kumiko Tanaka-Ishii and Armin Bunde. Long-Range Memory in Literary Texts: On the Universal Clustering of the Rare Words. PLoS One, 2016. link
  • Andrew Finch, Taisuke Harada, Kumiko Tanaka-Ishii, and Eiichiro Sumita. Inducing a Bilingual Lexicon from short Parallel Multiword Sequences. ACM Transactions on Asian Low-Resource Language Information Processing, 16(3), Article No. 15, 2016. link
  • Kumiko Tanaka-Ishii and Shunsuke Aihara. Computational Constancy Measures of Texts-Yule’s K and Rényi’s Entropy. Computational Linguistics (CL), 41(3): 481–502, 2015. link
  • Andre Horie and Kumiko Tanaka-Ishii. Sentence Hedge Detection without Cue Annotation: A Heuristic Cue Selection Approach. Journal of Natural Language Processing, 21(1):27-40, 2014. link
  • Andrew Michael Finch, Wei Song, Kumiko Tanaka-Ishii, Eiichiro Sumita. Speaking Louder Than Words with Pictures across Languages. AI Magazine, 34(2):31-47, 2013. link
  • Wei Song, Andrew Michael Finch, Kumiko Tanaka-Ishii, Keiji Yasuda, Eiichiro Sumita. picoTrans: An Intelligent Icon-Driven Interface for Cross-Lingual Communication. ACM Transactions on Interactive Intelligent Systems (IIS), 3(1):1–31, Article No. 5, 2013. link
  • Andre Kenji Horie, Kumiko Tanaka-Ishii, Mitsuru Ishizuka. Verb Temporality Analysis using Reichenbach’s Tense System. Proceedings of International Conference for Computational Linguistics (COLING), 471-482, 2012. link
  • Hiroshi Yamaguchi and Kumiko Tanaka-Ishii. Text Segmentation by Language Using Minimum Description Length. Proceedings of Annual Conference for Computational Linguistics (ACL), 969-978, July 2012. link
  • Kumiko Tanaka-Ishii. Information Bias Inside English Words. Journal of Quantitative Linguistics, 19(1): 77-94, 2012. link
  • Wei Song, Andrew Michael Finch, Kumiko Tanaka-Ishii, Eiichiro Sumita. picoTrans: An Icon-driven User Interface for Machine Translation on Mobile Devices. Proceedings of International Conference on Intelligent User Interfaces, pages 23–32, 2011, Best Paper Award. link
  • Daisuke Kimura (木村大翼) and Kumiko Tanaka-Ishii. Constants Invariant to Text Size: Yule's K and Golcher's VM (in Japanese). Journal of Natural Language Processing, 18(2):119–137, 2011. Best Paper Award. link
  • Kumiko Tanaka-Ishii and Hiroshi Terada. Word familiarity and frequency. Studia Linguistica, 65(1):96-116, April 2011. link
  • Kumiko Tanaka-Ishii, Satoshi Tezuka, and Hiroshi Terada. Sorting Texts by Readability. Computational Linguistics (CL), 36(2):203–227, 2010. link
  • Kumiko Tanaka-Ishii and Julian Godon. Kansuke: A Logograph Look-Up Interface Based on a Few Modified Stroke Prototypes. ACM Transactions on Computer-Human Interaction (TOCHI), 16(2):1–17, 2009. link
  • Dani Yogatama and Kumiko Tanaka-Ishii. Multilingual Spectral Clustering Using Document Similarity Propagation. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 871–879, 2009. link
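  • Yo Ehara and Kumiko Tanaka-Ishii. TypeAny: A Multilingual Text Entry System Using Language Detection (in Japanese). Journal of Natural Language Processing, 15(5):151–167, 2008. link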
  • Kumiko Tanaka-Ishii and Zhihui Jin. From phoneme to morpheme —another verification in English and Chinese using corpora—. Studia Linguistica, 62(2):224–248, July 2008. link
  • Yo Ehara and Kumiko Tanaka-Ishii. Multilingual Text Entry using Automatic Language Detection. Proceedings of International Joint Conference on Natural Language Processing, pages 441-448, 2008. link
  • Lars Yencken, Zhihui Jin, and Kumiko Tanaka-Ishii. Pinyomi: Dictionary lookup via orthographic associations. Proceedings of the 10th Conference of the Pacific Association for Computational Linguistics, pages 13–21, 2007. link
  • Kumiko Tanaka-Ishii and Julian Godon. Kansuke: A Kanji Look-Up System Based on a Few Stroke Prototypes. In International Conference on the Computer Processing of Oriental Languages (ICCPOL), pages 310–320, 2006. link
  • Kumiko Tanaka-Ishii and Hiroshi Nakagawa. A Multilingual Usage Consultation Tool Based on Internet Searching-More than a Search Engine, Less than QA-. Proceedings of International World Wide Web Conference Committee (IW3C2), pages 363–371, May 2005. Best presentation award. link
  • Kumiko Tanaka-Ishii, Michiko Abe, and Hiroshi Nakagawa. Categorization of movies using comments. Proceedings of International Conference of Pacific Association for Computational Linguistics, pages 221–229, 2003. Best paper award.
  • Kumiko Tanaka-Ishii, Daichi Hayakawa, and Masato Takeichi. Acquiring Vocabulary for Predictive Text Entry through Dynamic Reuse of a Small User Corpus. Proceedings of the Annual Meeting of Association for Computational Linguistics (ACL), pages 407–414, 2003. link
  • Kumiko Tanaka-Ishii, Yusuke Inutsuka, and Masato Takeichi. Entering Text with A Four-Button Device. In the 19th International Conference on Computational Linguistics (COLING), pages 988–994, 2002. link
  • Kumiko Tanaka-Ishii, Yusuke Inutsuka, and Masato Takeichi. Japanese Text Input System With Digits –Can Japanese text be estimated only from consonants?–. In Human Language Technology Conference (HLT), pages 211–218, 2001. link
  • Kumiko Tanaka-Ishii and Ian Frank. Multi-Agent Explanation Strategies in Real-Time Domains. Proceedings of Annual Meeting on Association for Computational Linguistics (ACL), pages 158–165, 2000. link
  • Kumiko Tanaka-Ishii, Koiti Hasida, and Itsuki Noda. Reactive Content Selection in the Generation of Real-time Soccer Commentary. Proceedings of International Conference on Computational Linguistics (COLING), pages 1282–1288, 1998. link
  • Kumiko Tanaka and Kyoji Umemura. Construction of a Bilingual Dictionary Intermediated by a Third Language. Proceedings of International Conference on Computational Linguistics (COLING), pages 297–303, 1994. link

Awards

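  • 2021 The 75th Mainichi Publishing Culture Award for 『言語とフラクタル』 (Language and Fractals)
  • 2013 Daniel Heffernan received the Dean's Award of the Graduate School of Information Science and Technology, The University of Tokyo, for master's research (Logue)
  • 2012 Best Journal Paper Award of 2011 (Daisuke Kimura (木村大翼) and Kumiko Tanaka-Ishii. Constants Invariant to Text Size: Yule's K and Golcher's VM. Journal of Natural Language Processing)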
  • 2011 IUI2011 Best Paper Award (“picoTrans: An Icon-driven User Interface for Machine Translation on Mobile Devices”)
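  • 2011 The 19th Ohkawa Publication Award for 『記号と再帰』 (Signs and Recursion)
  • 2010 The 32nd Suntory Prize for Social Sciences and Humanities, Thought and History category, for 『記号と再帰』 (Signs and Recursion)
  • 2010 Atsuo Shiraki (白木敦夫) received the FY2010 Outstanding Presentation Award of the University of Tokyo Speech, Language, and Communication Workshop (Collectter: a framework for a collection-based social game using Twitter)
  • 2010 Kotaro Kitagawa (北川浩太郎) received the FY2010 Outstanding Presentation Award of the University of Tokyo Speech, Language, and Communication Workshop (Deterministic dependency parsing based on tree structures)
  • 2010 Daisuke Kimura (木村大翼) received the FY2009 Young Researcher Encouragement Award at the Annual Meeting of the Association for Natural Language Processing (Text constants independent of document length)
  • 2009 Shunsuke Aihara (粟飯原俊介) received the FY2009 Outstanding Presentation Award of the University of Tokyo Speech, Language, and Communication Workshop
  • 2009 FY2008 Outstanding Presentation Award at the Annual Meeting of the Association for Natural Language Processing (Uneven distribution of information within words)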
  • 2008 The Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology, The Young Scientists’ Prize (文部科学大臣表彰若手科学者賞)
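  • 2007 Outstanding Presentation Award at the Annual Meeting of the Association for Natural Language Processing (Kansuke (漢輔): a kanji look-up interface for non-native speakers)
  • 2005 WWW Conference Best Presentation Award (“A Multilingual Usage Consultation Tool Based on Internet Searching -More than a Search Engine, Less than QA-”)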
  • 2003 PACLING Best Paper Award (“Categorization of Movies using Comments”)
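  • 2003 Outstanding Presentation Award at the Annual Meeting of the Association for Natural Language Processing (KIWI: a multilingual usage consultation tool based on search engines)
  • 1999 Scientific Challenge Award for the “development of automated and statistical game analysis systems and methodologies”
  • 1998 Scientific Challenge Award for the “development of fully automatic commentator systems for RoboCup simulator league”
  • 1998 Outstanding Presentation Award at the Annual Meeting of the Association for Natural Language Processing (MIKE: a real-time automatic soccer commentary system)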