Publication:

Design and Evaluation of Event Detection Techniques for Social Media Data Streams


Files

Weiler_0-330045.pdf (size: 37.33 MB, downloads: 598)

Date

2016

Authors

Weiler, Andreas

Open access publication
Open Access Green

Publication type
Dissertation
Publication status
Published

Abstract

The unprecedented success and active use of social media services result in massive amounts of user-generated data. A leading player in producing a large volume of data as a continuous stream of short messages, so-called tweets, is the social network platform Twitter. The brevity of tweets, with a maximum of 140 characters, makes them an ideal mobile communication medium. Therefore, Twitter's popularity as a source of up-to-date news and information on current events is constantly increasing. In response to this trend, numerous research works on event detection techniques applied to the Twitter data stream have been proposed. However, most of these works suffer from two major shortcomings. First, they tend to focus exclusively on the information extraction aspect and often ignore the streaming nature of the input. Second, although all of the proposed works provide some evidence as to the quality of the detected events, none relate this task-based performance to their run-time performance in terms of processing speed or data throughput. In particular, neither a quantitative nor a comparative evaluation of these aspects has been performed to date. This thesis mainly describes our research work to fill these gaps and to tackle the posed challenges. In the first part of the thesis, we present a technique for real-time event detection and tracking, which focuses on the streaming nature of the data. Additionally, we describe a technique for event detection in pre-defined geographic areas. In the second part of the thesis, we study the run-time and task-based performance of several state-of-the-art event detection techniques as well as baseline techniques using real-world Twitter streaming data. In order to reproducibly compare run-time performance, our approach is based on a general-purpose data stream management system, whereas task-based performance is automatically assessed based on a set of novel measures.
This set of measures is specifically designed to support the quantitative and qualitative comparison of event detection techniques. The last part of the thesis describes the design and evaluation of two visualizations to support visual event detection. First, we present “Stor-e-Motion”, a shape-based visualization to track the ongoing evolution of importance, emotion, and story of topics in user-defined topic channels applied to the Twitter data stream. Second, we present “SiCi Explorer”, a visualization that supports analysts in monitoring events/topics and emotions both in time and in space. The visualization uses a clock-face metaphor to encode temporal and spatial relationships, a color map to reflect emotion, and tag clouds to show the events and topics. Finally, we demonstrate the usefulness and usability of the visualization in a user study that we conducted.

Subject (DDC)
004 Computer science

Keywords

event detection, twitter, data streams, evaluation

Cite

ISO 690
WEILER, Andreas, 2016. Design and Evaluation of Event Detection Techniques for Social Media Data Streams [Dissertation]. Konstanz: University of Konstanz
BibTeX
@phdthesis{Weiler2016Desig-33720,
  year={2016},
  title={Design and Evaluation of Event Detection Techniques for Social Media Data Streams},
  author={Weiler, Andreas},
  address={Konstanz},
  school={Universität Konstanz}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/33720">
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/33720/3/Weiler_0-330045.pdf"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:title>Design and Evaluation of Event Detection Techniques for Social Media Data Streams</dcterms:title>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/33720"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/33720/3/Weiler_0-330045.pdf"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:abstract xml:lang="eng">The unprecedented success and active use of social media services result in massive amounts of user-generated data. A leading player in producing a large volume of data as a continuous stream of short messages, so-called tweets, is the social network platform Twitter. The brevity of tweets, with a maximum of 140 characters, makes them an ideal mobile communication medium. Therefore, Twitter's popularity as a source of up-to-date news and information on current events is constantly increasing. In response to this trend, numerous research works on event detection techniques applied to the Twitter data stream have been proposed. However, most of these works suffer from two major shortcomings. First, they tend to focus exclusively on the information extraction aspect and often ignore the streaming nature of the input. Second, although all of the proposed works provide some evidence as to the quality of the detected events, none relate this task-based performance to their run-time performance in terms of processing speed or data throughput. In particular, neither a quantitative nor a comparative evaluation of these aspects has been performed to date. This thesis mainly describes our research work to fill these gaps and to tackle the posed challenges. In the first part of the thesis, we present a technique for real-time event detection and tracking, which focuses on the streaming nature of the data. Additionally, we describe a technique for event detection in pre-defined geographic areas. In the second part of the thesis, we study the run-time and task-based performance of several state-of-the-art event detection techniques as well as baseline techniques using real-world Twitter streaming data. In order to reproducibly compare run-time performance, our approach is based on a general-purpose data stream management system, whereas task-based performance is automatically assessed based on a set of novel measures.
This set of measures is specifically designed to support the quantitative and qualitative comparison of event detection techniques. The last part of the thesis describes the design and evaluation of two visualizations to support visual event detection. First, we present “Stor-e-Motion”, a shape-based visualization to track the ongoing evolution of importance, emotion, and story of topics in user-defined topic channels applied to the Twitter data stream. Second, we present “SiCi Explorer”, a visualization that supports analysts in monitoring events/topics and emotions both in time and in space. The visualization uses a clock-face metaphor to encode temporal and spatial relationships, a color map to reflect emotion, and tag clouds to show the events and topics. Finally, we demonstrate the usefulness and usability of the visualization in a user study that we conducted.</dcterms:abstract>
    <dc:rights>terms-of-use</dc:rights>
    <dc:contributor>Weiler, Andreas</dc:contributor>
    <dcterms:issued>2016</dcterms:issued>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:language>eng</dc:language>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2016-04-27T13:26:11Z</dc:date>
    <dc:creator>Weiler, Andreas</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2016-04-27T13:26:11Z</dcterms:available>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
  </rdf:Description>
</rdf:RDF>

Date of dissertation examination

March 24, 2016
Thesis note
Konstanz, Univ., Diss., 2016
University bibliography
Yes