Exploring the Potential of Social Media Content for Detecting Transport-Related Activities

Dmitry Pavlyuk, Maria Karatsoli, Eftihia Nathanail

In: Kabashkin I., Yatskiv I., Prentkovskis O. (eds) Reliability and Statistics in Transportation and Communication. RelStat 2018. Lecture Notes in Networks and Systems, vol 68. Springer, Cham
DOI: 10.1007/978-3-030-12450-2_13
Keywords: Text mining, Twitter, Big data, Classification models, Location-based data

The widespread use of social media encourages users to share their activities and locations more often, leading to rapid growth in data volume. Current research retrieves this user-generated content from social media platforms in an effort to turn it into a powerful tool for transport-related data collection. In this paper, data from Twitter are retrieved and processed to explore their potential for providing transport-related data. The main objective is to investigate the reliability of the transport-related content retrieved from tweets and the transferability of analytics methods to other cities and languages. The research data set includes thousands of tweets collected in May–June 2018 in three cities: the Minneapolis-Saint Paul twin cities (USA), Riga (Latvia), and Volos (Greece). These research areas were selected because their environments differ substantially in population, language, and transport infrastructure. The collected data were classified into five classes: general transport-related information, real-time information, complaint, advice/question, and unrelated to transport. Based on the obtained results, a cross-comparison was made of the efficiency of Twitter as a social media source of transport-related information in different urban environments.