
MEETUPS identification of people and places

MEETUPS identification of people and places is a tool developed using Python and Jupyter Notebook. It uses DBpedia Spotlight to identify and annotate possible entity mentions in the input text. This step is essential for extracting two of the three main elements that define a meetup: people (who participated), place (where) and time (when). The tool covers people and places.

The implementation is divided into two Jupyter notebooks:

The first notebook queries DBpedia Spotlight, processes the responses (JSON format) and stores them locally. It takes as input the corpus of music personalities generated by the MEETUPS cleaning component: https://github.com/polifonia-project/meetups_pilot/blob/main/01_CleaningText.ipynb

  • Use DBpedia Spotlight to identify and annotate entity mentions from input text.
  • Retrieve and process JSON format responses from DBpedia Spotlight.
  • Store responses for later processing.
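
The query-and-cache steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the endpoint URL is the public DBpedia Spotlight service (an assumption, since the notebook may target a different server), the function names and the default `confidence` value are hypothetical, and the one-file-per-sentence naming inside cacheSpotlightResponse/ is also an assumption.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path

# Assumption: the public Spotlight endpoint; the notebook may use another server.
SPOTLIGHT_URL = "https://api.dbpedia-spotlight.org/en/annotate"


def build_request(text, confidence=0.5):
    """Build a GET request asking Spotlight for JSON annotations of `text`."""
    query = urllib.parse.urlencode({"text": text, "confidence": confidence})
    return urllib.request.Request(
        f"{SPOTLIGHT_URL}?{query}", headers={"Accept": "application/json"}
    )


def annotate(text, confidence=0.5, timeout=30):
    """Send the request and decode the JSON body (requires network access)."""
    request = build_request(text, confidence)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def cache_response(sentence_id, response, cache_dir="cacheSpotlightResponse"):
    """Store one response as <cache_dir>/<sentence_id>.json and return the path.

    The file naming scheme is hypothetical, chosen only for this sketch.
    """
    directory = Path(cache_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{sentence_id}.json"
    path.write_text(
        json.dumps(response, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path
```

Caching each response on disk means the second notebook can be rerun without querying the service again.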

The second notebook uses the responses from DBpedia Spotlight to capture data of people and places:

  • Use the responses from DBpedia Spotlight captured by the first notebook
  • Search for three types of entities: http://dbpedia.org/ontology/Person, http://dbpedia.org/ontology/MusicalArtist and http://dbpedia.org/ontology/Place
  • Classify the first two types as “people” and the last one as “place”
  • Store the results in extractedEntitiesPersonPlaceOnly/
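
The classification rule above can be sketched as follows. This is an illustration under stated assumptions rather than the notebook's code: it assumes each cached response has a `Resources` list whose items carry `@URI`, `@surfaceForm` and a comma-separated `@types` string with `DBpedia:`-prefixed ontology types, as the public Spotlight service returns them. The function names and the output record layout are hypothetical, and giving "people" precedence when an entity carries both kinds of type is also an assumption.

```python
DBO = "http://dbpedia.org/ontology/"

PEOPLE_TYPES = {DBO + "Person", DBO + "MusicalArtist"}
PLACE_TYPES = {DBO + "Place"}


def expand_types(types_field):
    """Split a comma-separated @types string and expand 'DBpedia:' prefixes
    into full DBpedia ontology URIs; other entries are kept unchanged."""
    expanded = set()
    for item in types_field.split(","):
        item = item.strip()
        if item.startswith("DBpedia:"):
            expanded.add(DBO + item[len("DBpedia:"):])
        elif item:
            expanded.add(item)
    return expanded


def classify(resource):
    """Return 'people', 'place', or None for one annotated resource."""
    types = expand_types(resource.get("@types", ""))
    if types & PEOPLE_TYPES:  # assumption: people take precedence over place
        return "people"
    if types & PLACE_TYPES:
        return "place"
    return None


def extract_people_places(response):
    """Keep only the resources classified as people or place."""
    results = []
    for resource in response.get("Resources", []):
        category = classify(resource)
        if category is not None:
            results.append(
                {
                    "uri": resource.get("@URI"),
                    "mention": resource.get("@surfaceForm"),
                    "category": category,
                }
            )
    return results
```

Resources with no matching type are dropped, which mirrors the "person and place only" scope of the output directory.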

Information on installation and setup

  • Jupyter notebooks: 02_queryDbpedia.ipynb, 02_Identify_PP.ipynb

Details of the data

Code location:
|_ 02_queryDbpedia.ipynb
|_ 02_Identify_PP.ipynb

Index data location
Data input:
|_ indexedSentences/

DBpedia Spotlight annotations:
|_ cacheSpotlightResponse/        

People and places annotation
Data output:
|_ extractedEntitiesPersonPlaceOnly/


|_ README_people_places_identification.md

DOI:

TODO