The Twitter of Babel: Mapping world languages through microblogging platforms

Mocanu, Delia, Baronchelli, Andrea, Perra, Nicola (ORCID: 0000-0002-5559-3064), Goncalves, Bruno, Zhang, Qian and Vespignani, Alessandro (2013) The Twitter of Babel: Mapping world languages through microblogging platforms. PLoS ONE, 8 (4). e61981. ISSN 1932-6203 (Print), 1932-6203 (Online) (doi:https://doi.org/10.1371/journal.pone.0061981)

PDF (Publisher PDF)
14938_Perra_The Twitter of Babel (pub PDF OA) 2013.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Large-scale analysis and statistics of socio-technical systems that just a few short years ago would have required the use of consistent economic and human resources can nowadays be conveniently performed by mining the enormous amount of digital data produced by human activities. Although a characterization of several aspects of our societies is emerging from the data revolution, a number of questions concerning the reliability and the biases inherent to the big data “proxies” of social life are still open. Here, we survey worldwide linguistic indicators and trends through the analysis of a large-scale dataset of microblogging posts. We show that available data allow for the study of language geography at scales ranging from country-level aggregation to specific city neighborhoods. The high resolution and coverage of the data allow us to investigate different indicators such as the linguistic homogeneity of different countries, the touristic seasonal patterns within countries and the geographical distribution of different languages in multilingual regions. This work highlights the potential of geolocalized studies of open data sources to improve current analysis and develop indicators for major social phenomena in specific communities.

Item Type: Article
Additional Information: © 2013 Mocanu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Uncontrolled Keywords: Languages, Data science
Faculty / School / Research Centre / Research Group: Faculty of Business > Networks and Urban Systems Centre (NUSC) > Centre for Business Network Analysis (CBNA)
Last Modified: 21 Oct 2020 10:05
URI: http://gala.gre.ac.uk/id/eprint/14938
