MEETUPS Corpus collection

Collecting Wikipedia pages of people in the music scene in Europe

Details of dataset

The authors' names and dbo:wikiPageID values were retrieved with SPARQL queries against the DBpedia SPARQL endpoint, https://dbpedia.org/sparql

Query filters:

Categories: <http://dbpedia.org/resource/Category:Music_people>
            <http://dbpedia.org/resource/Category:People
Location:
            sparqlQueryResults/query.sparql
Query results:
            sparqlQueryResults/Q<1>_sparql.csv
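
As an illustration of the kind of query described above, here is a minimal Python sketch that builds a SPARQL query for one category and encodes it as a request URL for the DBpedia endpoint. It is not the query stored in sparqlQueryResults/query.sparql. The function names, the use of dct:subject and rdfs:label, the English-label filter, the paging values, and the `format=text/csv` request parameter are all assumptions made for this example.

```python
from urllib.parse import urlencode

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_query(category_uri: str, limit: int = 10000, offset: int = 0) -> str:
    """Return a SPARQL query listing resources in one category with their page IDs.

    Hypothetical helper: the real query lives in sparqlQueryResults/query.sparql.
    """
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?person ?name ?wikiPageID WHERE {{
  ?person dct:subject <{category_uri}> ;
          dbo:wikiPageID ?wikiPageID ;
          rdfs:label ?name .
  FILTER (lang(?name) = "en")
}}
ORDER BY ?wikiPageID
LIMIT {limit} OFFSET {offset}
""".strip()


def build_request_url(query: str) -> str:
    """Encode the query as a GET URL that asks the endpoint for CSV output (assumed parameter)."""
    return DBPEDIA_ENDPOINT + "?" + urlencode({"query": query, "format": "text/csv"})


if __name__ == "__main__":
    query = build_query("http://dbpedia.org/resource/Category:Music_people", limit=100)
    print(build_request_url(query))
```

The printed URL can be fetched with any HTTP client. Paging through results with LIMIT and OFFSET is one way to stay under an endpoint's per-request result cap.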

Dataset:

Location:
            dataset/
Format:
            Text files .txt
Naming convention:
            <Author_wikiPageID>.txt
Total biographies collected:
            33,309 authors' Wikipedia pages
Summary total biographies collected: 
            sparqlQueryResults/TOTAL_download_biography.csv
MEETUPS pilot sample: 1,002

Select random biographies -> sampleBiographies.py
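
The random selection step could look roughly like the sketch below. It is not the contents of sampleBiographies.py. It assumes the biographies are the .txt files directly inside dataset/, and the function name, the `seed` parameter, and the sample size used in the usage example are hypothetical.

```python
import random
from pathlib import Path


def sample_biographies(dataset_dir, k, seed=None):
    """Return k distinct biography files chosen at random from dataset_dir.

    Files are sorted first so that a given seed always yields the same sample.
    """
    files = sorted(Path(dataset_dir).glob("*.txt"))
    if k > len(files):
        raise ValueError(f"requested {k} files but only {len(files)} are available")
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    return rng.sample(files, k)


if __name__ == "__main__":
    # Example: draw a reproducible sample and print the wikiPageID part of each file name.
    for path in sample_biographies("dataset", k=10, seed=42):
        print(path.stem)
```

Fixing the seed makes the sample reproducible, which is useful if the pilot sample has to be regenerated later.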