Create a relevant corpus
Persona
Sethus is a music theorist, composer and teacher specializing in late Renaissance music. In this field, his research focuses on the description and formalization of the compositional processes at work in modal polyphony. His thinking is based on a close reading of the scientific literature (historical and contemporary), from which he builds a synthesis oriented towards compositional practice.
Goal
Sethus is preparing a book on a specific aspect of compositional practice in the Renaissance period, and wants to find a set of relevant source references to illustrate his text. Each concept addressed in the book (e.g., modal cadences, imitations, reverse imitations, structure of pieces, modes, etc.) should be illustrated by a list of fragments taken from pieces by various composers, from various periods, in various compositional styles, etc. Ideally, the same piece would serve as a source to illustrate several concepts. Fragments should be easy to refer to, annotated, quotable, and interlinked to highlight their relationships, differences and resemblances (i.e., the link itself should be annotated to express its purpose).
Sethus can access a large corpus of music resources, including music scores encoded in MEI. However, most of them are either irrelevant or unsuitable (they lack representative fragments, or contain only outliers far from the canonical occurrences he is looking for). His goal is therefore to extract, or organize according to his needs, a subset of the corpus that serves as a source for the book's illustrations.
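As an illustration of the requirements above, the following minimal sketch shows one way fragments and annotated links could be modelled. All class names, field names and identifiers (Fragment, Link, piece_id, "piece-A", etc.) are hypothetical and not part of any existing system; addressing a fragment by measure range is an assumption made for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A quotable span of a piece (all field names are hypothetical)."""
    piece_id: str          # identifier of the encoded score (assumed)
    start_measure: int
    end_measure: int
    concepts: frozenset    # concepts illustrated, e.g. {"modal cadence"}
    note: str = ""         # free-text user annotation

    def ref(self) -> str:
        """Short, stable label for citing the fragment in the text."""
        return f"{self.piece_id}:mm.{self.start_measure}-{self.end_measure}"


@dataclass(frozen=True)
class Link:
    """A relationship between two fragments, annotated with its purpose."""
    source: Fragment
    target: Fragment
    purpose: str           # why the two fragments are related


def concepts_by_piece(fragments):
    """Map each piece to the set of concepts its fragments illustrate."""
    result = {}
    for fragment in fragments:
        result.setdefault(fragment.piece_id, set()).update(fragment.concepts)
    return result


# Placeholder data: two fragments of the same (fictional) piece.
cadence = Fragment("piece-A", 12, 14, frozenset({"modal cadence"}))
entry = Fragment("piece-A", 1, 4, frozenset({"imitation"}))
link = Link(cadence, entry, purpose="same subject, different function")
```

Grouping fragments by piece, as concepts_by_piece does, makes it easy to spot pieces that can illustrate several concepts at once.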
Scenario
Sethus uses a search tool that lets him explore the corpus at hand based on several criteria. An initial search can be based on a melodic profile (say, a typical cadence in a soprano line). The system then automatically subdivides the first result set based on secondary criteria (period, mode, composer, for instance). Sethus can fix this initial organization and refine the search on each part of the hierarchy thereby created, and can restart the process by choosing another initial criterion. Along the way, identified fragments can be annotated, and the system keeps track of the search pattern that characterizes each level in the result set hierarchy. Additional tools can be used to further reinforce the structure of the result set, leading to a final Relevant Corpus with a rich internal structure and annotations, whose definition (the set of queries and the hierarchy of facets) can be stored and later reconsidered.
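The scenario above can be sketched as a two-step process: a melodic search followed by a faceted subdivision of the results. The sketch below is only an assumption about how this might work. It represents a soprano line as a list of MIDI pitch numbers, matches a pattern of successive semitone intervals (so that matching is transposition-invariant), and groups the hits by metadata keys. The function names, the dictionary layout and the pattern [-1, 1] are illustrative placeholders, not a claim about what a typical cadence looks like.

```python
from collections import defaultdict


def intervals(pitches):
    """Convert a list of MIDI pitches to successive semitone intervals."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def find_occurrences(pitches, pattern):
    """Return the start indices where the interval pattern occurs."""
    seq = intervals(pitches)
    n = len(pattern)
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == pattern]


def search(corpus, pattern):
    """Keep the pieces whose soprano line contains the pattern."""
    hits = []
    for piece in corpus:
        positions = find_occurrences(piece["soprano"], pattern)
        if positions:
            hits.append({**piece, "occurrences": positions})
    return hits


def facet(results, criteria):
    """Recursively subdivide a result set by a list of metadata keys."""
    if not criteria:
        return results
    groups = defaultdict(list)
    for result in results:
        groups[result[criteria[0]]].append(result)
    return {key: facet(group, criteria[1:]) for key, group in groups.items()}


# Toy corpus with placeholder metadata (not real pieces).
corpus = [
    {"id": "piece-A", "period": "early", "mode": "dorian",
     "soprano": [62, 64, 65, 64, 65]},
    {"id": "piece-B", "period": "late", "mode": "dorian",
     "soprano": [69, 67, 66, 67]},
    {"id": "piece-C", "period": "late", "mode": "phrygian",
     "soprano": [64, 65, 67, 69]},
]

hits = search(corpus, pattern=[-1, 1])     # down a semitone, then back up
tree = facet(hits, ["mode", "period"])     # hierarchy: mode, then period
```

Changing the order of the keys passed to facet yields a different hierarchy over the same hits, which corresponds to restarting the process with another initial criterion.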
Competency questions
CQ1: Search operations should be expressed on either raw-level features (e.g., melodic lines) or higher-level features (e.g., modes, cadences, triad patterns, etc.). The system should efficiently retrieve all sources that match the search pattern, and identify the pattern occurrences.
CQ2: A subset can be hierarchically refined based on criteria that are independent of one another and of the initial search pattern. The user can navigate in this hierarchy and refine the search.
CQ3: At each step, the result set takes the form of a subcorpus organized according to rich relationships, and enriched with user annotations.
CQ4: A result set can be stored (in intension, not in extension, i.e., as the queries and facet hierarchy that define it rather than as an enumerated list of results) as a Relevant Corpus, and its definition can be revised at any moment.
CQ5: Navigating through a Relevant Corpus is based on annotations and search patterns.
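To make CQ4 concrete, the sketch below stores a Relevant Corpus as its definition (queries and facets) rather than as a list of results, and recomputes the matching pieces on demand. It is a simplified illustration under stated assumptions: queries are reduced to key/value constraints on metadata, the definition is serialized as JSON, and all names (evaluate, matches, the "queries" and "facets" keys) are hypothetical.

```python
import json


def matches(piece, query):
    """True if every key/value constraint of the query holds for the piece."""
    return all(piece.get(key) == value for key, value in query.items())


def evaluate(definition, corpus):
    """Recompute the matching piece ids from the stored definition."""
    return [piece["id"] for piece in corpus
            if all(matches(piece, query) for query in definition["queries"])]


# The definition holds queries and facets only, never the results themselves.
definition = {
    "name": "example",
    "queries": [{"mode": "dorian"}],
    "facets": ["period", "composer"],
}

stored = json.dumps(definition)    # persist the definition as text
restored = json.loads(stored)      # reload it later

# Toy corpus with placeholder metadata (not real pieces).
corpus = [
    {"id": "piece-A", "mode": "dorian", "period": "early"},
    {"id": "piece-B", "mode": "phrygian", "period": "late"},
]
before = evaluate(restored, corpus)

# The corpus grows; the stored definition picks up the new piece unchanged.
corpus.append({"id": "piece-C", "mode": "dorian", "period": "late"})
after = evaluate(restored, corpus)

# Revising the definition narrows the result set.
restored["queries"].append({"period": "late"})
revised = evaluate(restored, corpus)
```

Because only the definition is stored, the same Relevant Corpus stays up to date when the underlying corpus changes, and revising it amounts to editing the list of queries or facets.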
Resources
Bibliography