Classification and curation of Listening Experiences with LLMs (Demo)
Description
This demo component was developed with the aim of supporting the identification of implicit themes (classification) and metadata (curation) in text. It takes as reference the documentary evidence benchmark:
@book{tirado_2023, title={polifonia-project/meetups-knowledge-graph: v0.2}, url={https://zenodo.org/record/7924618}, DOI={10.5281/ZENODO.7924618}, abstractNote={The MEETUPS knowledge graph contains data about historical encounters of people in the musical world in Europe from c. 1800 to c. 1945.}, publisher={Zenodo}, author={Tirado, Alba Morales}, year={2023}, month={May} }
Tirado, A. M. (2023). polifonia-project/meetups-knowledge-graph: v0.2 (Version v0.2) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.7924618
Classification and curation of Listening Experiences (Demo)
This small study, undertaken as part of the wider CHILD pilot, focuses on harnessing LLM technology
to classify existing text extracts within LED, a task traditionally performed by human domain experts,
to address the challenges posed by the volume of textual data in fields such as music history.
Our experiment evaluates the effectiveness of an LLM in categorizing text extracts under the specific
theme of childhood, comparing its performance with that of a human domain expert. The comparison
aims to quantify the alignment between machine and human interpretations in textual analysis, to
identify areas where LLM technology may show weaknesses, and to investigate whether there are areas
where LLMs can shed new light on data that may go unnoticed by humans.
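As an illustration of how alignment between the LLM and the human expert could be quantified, the sketch below computes raw agreement and Cohen's kappa for two lists of labels. This is an assumption for illustration only: the text above does not state which agreement measure the study used, and the function name `agreement_and_kappa` is hypothetical and is not part of this distribution.

```python
from collections import Counter


def agreement_and_kappa(human, llm):
    """Return (observed agreement, Cohen's kappa) for two label sequences.

    Illustrative only: the measure actually used in the study is not
    specified in this README.
    """
    if len(human) != len(llm) or len(human) == 0:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(human)
    # Fraction of extracts on which both annotators chose the same label.
    observed = sum(h == m for h, m in zip(human, llm)) / n

    # Agreement expected by chance, from each annotator's label frequencies.
    human_counts = Counter(human)
    llm_counts = Counter(llm)
    labels = set(human_counts) | set(llm_counts)
    expected = sum(human_counts[c] * llm_counts[c] for c in labels) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return observed, 1.0
    return observed, (observed - expected) / (1.0 - expected)
```

With labels such as `True` (extract relates to childhood) and `False`, the function could be called as `agreement_and_kappa(human_labels, llm_labels)`. Kappa corrects raw agreement for chance, which matters when one label is much more frequent than the other.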
The software included here was developed with the aim of supporting the identification of
implicit themes in text and takes as reference the documentary evidence benchmark.
Interactions with the ChatGPT API (or another LLM) are currently handled in
the chatgpt.py file. Interactions with the LED knowledge graph are handled in led.py. In
order to run any of the scripts in this distribution, a copy of config.py.dist must be
made, called config.py, in which a valid OpenAI API key should be specified.
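The configuration step described above can be sketched as a small helper that copies config.py.dist to config.py when the latter does not yet exist. The helper name `ensure_config` is hypothetical and is not part of this distribution. The sketch only creates the copy; the OpenAI API key must still be entered by hand in the resulting config.py, following whatever fields the template defines.

```python
import shutil
from pathlib import Path


def ensure_config(directory="."):
    """Copy config.py.dist to config.py if config.py is missing.

    Hypothetical helper: an existing config.py is never overwritten,
    so a key that has already been entered is preserved.
    """
    base = Path(directory)
    template = base / "config.py.dist"
    target = base / "config.py"

    if not target.exists():
        if not template.exists():
            raise FileNotFoundError(f"template not found: {template}")
        shutil.copyfile(template, target)
    return target
```

Calling `ensure_config()` from the root of the distribution returns the path to config.py, which can then be opened in an editor to add the key.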
A summary of the experiments performed is provided in ‘output/CHILD_text_classification_with_LLM.pdf’.
Results and analysis are provided in ‘output/ChatGPT-CHILD-Analysis.xlsx’.