DH 2026. Retrieval-Augmented Description Generation for Ceramic Artworks: Effectiveness of Knowledge-Enhancement by the Museum Metadata

Large language models (LLMs) such as ChatGPT are increasingly used in the cultural heritage domain for tasks like metadata creation, semantic enrichment, and artwork captioning. Since these tasks depend on curated metadata, it is essential to evaluate AI-generated descriptions and understand how human metadata improves them.

We present a method for automatically generating descriptions of ceramic and porcelain artworks in Digital Humanities (DH). Ceramics suit captioning tasks: they are three-dimensional and harder to describe than flat objects such as paintings, have simple forms that reveal basic generation quality, and represent widely used historical objects whose documentation is often incomplete.

Using an 11,566-item open-access Rijksmuseum dataset, we compare an LLM (ChatGPT) with our RAG-enhanced LLM, TerraLex, which retrieves similar artworks and uses their metadata. Our results show that the RAG method produces more accurate, context-rich descriptions with fewer errors, and human evaluators consistently preferred them. These findings highlight the advantage of RAG for description generation and the value of complete, high-quality human metadata. TerraLex supports humanities research by strengthening the completeness and clarity of object descriptions used in cataloguing, stylistic comparison, and interpretation. It also aids teaching and public engagement by providing accessible explanations of ceramic artifacts. The novelty of this work lies primarily in the proposed workflow and in the empirical benefits it demonstrates for metadata generation.
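As a rough illustration of the retrieve-then-generate idea described above, the following minimal sketch retrieves the catalogue records most similar to a query and places their metadata into a prompt for an LLM. It is not the TerraLex implementation: the similarity measure (bag-of-words cosine over text), the record fields, the toy records, and the function names `retrieve` and `build_prompt` are all assumptions made for illustration, and the actual LLM call is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical catalogue records for illustration only;
# they are NOT taken from the Rijksmuseum dataset.
CATALOGUE = [
    {"id": "A1", "title": "Plate with floral decoration",
     "material": "porcelain", "date": "c. 1700"},
    {"id": "A2", "title": "Vase with dragon motif",
     "material": "porcelain", "date": "c. 1650"},
    {"id": "A3", "title": "Earthenware jug",
     "material": "earthenware", "date": "c. 1600"},
]


def vectorize(text):
    """Turn text into a bag-of-words count vector (lowercased tokens)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def record_text(record):
    """Concatenate all metadata fields except the identifier."""
    return " ".join(str(v) for k, v in record.items() if k != "id")


def retrieve(query, catalogue, k=2):
    """Return the k records whose metadata is most similar to the query."""
    q = vectorize(query)
    ranked = sorted(
        catalogue,
        key=lambda r: cosine(q, vectorize(record_text(r))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, neighbours):
    """Assemble an LLM prompt that includes the retrieved metadata."""
    lines = [
        "Describe the following ceramic object.",
        f"Object: {query}",
        "Metadata of similar catalogued objects:",
    ]
    for r in neighbours:
        lines.append(f"- {r['title']} ({r['material']}, {r['date']})")
    return "\n".join(lines)


if __name__ == "__main__":
    query = "porcelain vase with dragon"
    print(build_prompt(query, retrieve(query, CATALOGUE)))
```

In this sketch the quality of the generated prompt depends directly on how complete the retrieved records are, which mirrors the point above about the value of complete, high-quality human metadata.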

References

Kaoru Shimabayashi and Kumiko Tanaka-Ishii. Retrieval-Augmented Description Generation for Ceramic Artworks — Effectiveness of Knowledge-Enhancement by the Museum Metadata. Accepted to Digital Humanities 2026 (DH2026), the annual international conference of the Association of Digital Humanities Organizations (ADHO), to appear in 2026. [link]

