
Large language models (LLMs) such as ChatGPT are increasingly used in the cultural heritage domain for tasks like metadata creation, semantic enrichment, and artwork captioning. Because these tasks depend on curated metadata, it is essential to evaluate AI-generated descriptions and to understand how human-authored metadata improves them.
We present a method for automatically generating descriptions of ceramic and porcelain artworks in Digital Humanities (DH). Ceramics suit captioning tasks: they are three-dimensional and harder to describe than flat objects such as paintings, have simple forms that reveal basic generation quality, and represent widely used historical objects whose documentation is often incomplete.
Using an open-access Rijksmuseum dataset of 11,566 items, we compare an LLM (ChatGPT) with TerraLex, our LLM enhanced with retrieval-augmented generation (RAG), which retrieves similar artworks and uses their metadata. Our results show that the RAG method produces more accurate, context-rich descriptions with fewer errors, and that human evaluators consistently preferred them. These findings highlight the advantage of RAG for description generation and the value of complete, high-quality human metadata. TerraLex supports humanities research by strengthening the completeness and clarity of object descriptions used in cataloguing, stylistic comparison, and interpretation. It also aids teaching and public engagement by providing accessible explanations of ceramic artifacts. The novelty of this work lies primarily in the proposed workflow and in the empirical benefits it demonstrates for metadata generation.
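The retrieve-then-describe idea can be illustrated with a minimal sketch. It is not the TerraLex implementation: the abstract does not say how similar artworks are found, so the sketch assumes a simple token-overlap cosine similarity over text metadata as a stand-in. The catalogue records, field names, and the functions `retrieve` and `build_prompt` are hypothetical, and the step that sends the prompt to an LLM is left out.

```python
import math
from collections import Counter

# Hypothetical toy catalogue; records and field names are invented for illustration.
CATALOGUE = [
    {"id": "A1", "title": "Blue and white porcelain plate",
     "metadata": "porcelain plate, underglaze blue, floral decoration"},
    {"id": "A2", "title": "Stoneware jug",
     "metadata": "stoneware jug, salt glaze, handle"},
    {"id": "A3", "title": "Porcelain teapot",
     "metadata": "porcelain teapot, blue floral decoration, lid"},
]


def vectorize(text):
    """Turn a text into a bag-of-words count vector."""
    return Counter(text.lower().replace(",", " ").split())


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, catalogue, k=2):
    """Return the k catalogue records whose metadata best match the query."""
    q = vectorize(query)
    ranked = sorted(catalogue,
                    key=lambda r: cosine(q, vectorize(r["metadata"])),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, neighbours):
    """Assemble an LLM prompt that carries the retrieved metadata as context."""
    context = "\n".join(f"- {r['title']}: {r['metadata']}" for r in neighbours)
    return (
        "Describe the following ceramic artwork.\n"
        f"Artwork: {query}\n"
        "Metadata of similar catalogued artworks:\n"
        f"{context}\n"
    )


query = "porcelain dish with blue floral decoration"
neighbours = retrieve(query, CATALOGUE, k=2)
prompt = build_prompt(query, neighbours)
print(prompt)
```

In this sketch the baseline condition would correspond to sending only the first two lines of the prompt, while the RAG condition adds the retrieved metadata as context.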
References
Kaoru Shimabayashi and Kumiko Tanaka-Ishii. Retrieval-Augmented Description Generation for Ceramic Artworks — Effectiveness of Knowledge-Enhancement by the Museum Metadata.