The FGLOCTweet Corpus: An English tweet-based corpus for fine-grained location-detection tasks

Fernández-Martínez, Nicolás José

dc.contributor.author	Fernández-Martínez, Nicolás José
dc.date.accessioned	2025-01-30T15:47:52Z
dc.date.available	2025-01-30T15:47:52Z
dc.date.issued	2022-01
dc.description.abstract	Location detection in social-media microtexts is an important natural language processing task for emergency-based contexts where locative references are identified in text data. Spatial information obtained from texts is essential to understand where an incident happened, where people are in need of help and/or which areas have been affected. This information contributes to raising emergency situation awareness, which is then passed on to emergency responders and competent authorities to act as quickly as possible. Annotated text data are necessary for building and evaluating location-detection systems. The problem is that available corpora of tweets for location-detection tasks are either lacking or, at best, annotated with coarse-grained location types (e.g. cities, towns, countries, some buildings, etc.). To bridge this gap, we present our semi-automatically annotated corpus, the Fine-Grained LOCation Tweet Corpus (FGLOCTweet Corpus), an English tweet-based corpus for fine-grained location-detection tasks, including fine-grained locative references (i.e. geopolitical entities, natural landforms, points of interest and traffic ways) together with their surrounding locative markers (i.e. direction, distance, movement or time). It includes annotated tweet data for training and evaluation purposes, which can be used to advance research in location detection, as well as in the study of the linguistic representation of place or of the microtext genre of social media.	es_ES
dc.identifier.citation	Fernández-Martínez, N. J. (2022). The FGLOCTweet Corpus: An English tweet-based corpus for fine-grained location-detection tasks. Research in Corpus Linguistics, 10(1), 117–133. https://doi.org/10.32714/ricl.10.01.06	es_ES
dc.identifier.issn	2243-4712	es_ES
dc.identifier.other	10.32714/ricl.10.01.06	es_ES
dc.identifier.uri	https://hdl.handle.net/10953/4579
dc.language.iso	eng	es_ES
dc.publisher	Spanish Association for Corpus Linguistics	es_ES
dc.relation.ispartof	Research in Corpus Linguistics [2022]; [10(1)]; [117-133]	es_ES
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Spain	*
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es/	*
dc.subject	corpus for training and evaluating models	es_ES
dc.subject	fine-grained locations	es_ES
dc.subject	location detection	es_ES
dc.subject	locative references	es_ES
dc.subject	tweets	es_ES
dc.title	The FGLOCTweet Corpus: An English tweet-based corpus for fine-grained location-detection tasks	es_ES
dc.type	info:eu-repo/semantics/article	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES

Files

Original bundle

Name: The FGLOCTweet Corpus An English tweet-based corpus for fine-grained location-detection tasks.pdf
Size: 320.57 KB
Format: Adobe Portable Document Format