Cover Image for System.Linq.Enumerable+EnumerablePartition`1[System.Char]

Large-Scale System for Social Media Data Warehousing

OAI: oai:igi-global.com:290890 DOI: 10.4018/IJDWM.290890
Published by: IGI Global

Abstract

Social media data become an integral part in the business data and should be integrated into the decisional process for better decision making based on information which reflects better the true situation of business in any field. However, social media data are unstructured and generated in very high frequency which exceeds the capacity of the data warehouse. In this work, we propose to extend the data warehousing process with a staging area which heart is a large scale system implementing an information extraction process using Storm and Hadoop frameworks to better manage their volume and frequency. Concerning structured information extraction, mainly events, we combine a set of techniques from NLP, linguistic rules and machine learning to succeed the task. Finally, we propose the adequate data warehouse conceptual model for events modeling and integration with enterprise data warehouse using an intermediate table called Bridge table. For application and experiments, we focus on drug abuse events extraction from Twitter data and their modeling into the Event Data Warehouse.