Methodological Problems with Transformation and Size Reduction of Data Sets in Network Analysis

Date
2006
Authors
Marschall, Nicolas
Publication type
Diploma thesis
Abstract
This thesis is a methodological study in the field of social network analysis. It investigates how certain factors can interfere with data collection and data analysis and thereby lead to invalid or unreliable results for network-analytical measures. The discussion focuses on one-mode whole-network designs in which data are collected by questionnaire. The thesis begins with a short introduction to the methods of network analysis and then reviews the literature on the validity of network analysis in general. It then discusses the potential influencing factors investigated in this study and describes the analysis.

In particular, the thesis investigates nonresponse (biased and unbiased), forgetting, attempts at sampling, the omission of unimportant actors, symmetrization, dichotomization, and the collapsing of actors. The effects of these processes and methods are assessed by comparing network-analytical measures calculated on an unchanged data set with the same measures calculated on variants of that data set in which the processes have been simulated. The measures tested are density, degree centralization, eigenvector centralization, the determination of cliques and k-plexes, degree centrality, closeness centrality, eigenvector centrality, and betweenness centrality. Density, centralization, and the extent of size reduction are expected to be the main factors influencing validity and reliability. All combinations of size-reduction or transformation processes and network-analytical measures are simulated using a total of seven matrices representing different densities and centralizations. In some cases, different extents of size reduction and different strategies for dealing with the problem are investigated as well.
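The comparison design described above can be illustrated with a minimal sketch. It is not the procedure used in the thesis: the 20-actor valued network, the threshold of 1, the 25% nonresponse rate, the maximum rule for symmetrization, and the choice to blank a nonrespondent's outgoing ties are all assumptions made for illustration, and every function name is hypothetical.

```python
import random

def density(adj):
    """Density of a directed binary network: ties present / ties possible."""
    n = len(adj)
    ties = sum(adj[i][j] for i in range(n) for j in range(n) if i != j)
    return ties / (n * (n - 1))

def dichotomize(valued, threshold):
    """Turn a valued matrix into a binary one: 1 where strength >= threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in valued]

def symmetrize_max(adj):
    """Symmetrize by the maximum rule: a tie exists if either actor reports it."""
    n = len(adj)
    return [[max(adj[i][j], adj[j][i]) for j in range(n)] for i in range(n)]

def simulate_nonresponse(adj, rate, rng):
    """Blank the rows (outgoing ties) of randomly chosen nonrespondents."""
    n = len(adj)
    missing = set(rng.sample(range(n), round(rate * n)))
    return [[0] * n if i in missing else list(adj[i]) for i in range(n)]

rng = random.Random(0)
n = 20
# Hypothetical valued network: tie strengths 0..3, no self-ties.
valued = [[0 if i == j else rng.choice([0, 0, 1, 2, 3]) for j in range(n)]
          for i in range(n)]

full = dichotomize(valued, threshold=1)                 # the "unchanged" data set
reduced = simulate_nonresponse(full, rate=0.25, rng=rng)  # one simulated variant

print("density, full data:       ", density(full))
print("density, 25% nonresponse: ", density(reduced))
print("density, symmetrized:     ", density(symmetrize_max(full)))
```

Each variant is derived from the same starting matrix, so any difference in a measure can be attributed to the simulated process rather than to differences between data sets.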

The study concludes that size-reduction and transformation processes can significantly change the results of an analysis. In most cases, the error introduced into the network-analytical measures is biased in one direction, most often negative. The deviations of the estimates from the true values depend on the extent of size reduction. Density and centralization are also influencing factors in many cases; however, the direction of their influence varies with the measure and the method. Certain measures, such as closeness centrality and the determination of subgroups, are especially vulnerable, and certain size-reduction and transformation processes are more dangerous than others. These results are presented in detail at the end of the thesis.
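One way to examine how the deviation of an estimate depends on the extent of data loss is to repeat a simulated process at several intensities and average the signed error. The sketch below does this for forgetting, modeled here (as an assumption) by dropping each tie of an undirected network independently with probability p. The test network, the probabilities, the number of runs, and all function names are hypothetical, and the degree centralization used is Freeman's index for undirected graphs; the thesis may define and simulate these differently.

```python
import random

def density(adj):
    """Density of an undirected binary network."""
    n = len(adj)
    ties = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return ties / (n * (n - 1) / 2)

def degree_centralization(adj):
    """Freeman degree centralization for an undirected graph (1.0 for a star)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

def forget_ties(adj, p, rng):
    """Return a copy of adj in which each tie is dropped with probability p."""
    n = len(adj)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j] and rng.random() >= p:
                out[i][j] = out[j][i] = 1
    return out

def mean_deviation(adj, measure, p, rng, runs=200):
    """Average signed error of a measure over repeated simulated forgetting."""
    true_value = measure(adj)
    total = sum(measure(forget_ties(adj, p, rng)) - true_value
                for _ in range(runs))
    return total / runs

rng = random.Random(1)
n = 15
# Hypothetical test network: actor 0 is tied to everyone, plus random ties.
network = [[0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        if i == 0 or rng.random() < 0.2:
            network[i][j] = network[j][i] = 1

for p in (0.1, 0.3, 0.5):
    print(f"p={p}:",
          "density error", mean_deviation(network, density, p, rng),
          "centralization error", mean_deviation(network, degree_centralization, p, rng))
```

Because forgetting can only remove ties, the density error in this sketch can never be positive; the sign and size of the centralization error depend on the structure of the starting network.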
Summary in another language (translated from German)
This diploma thesis is a methodological study in the field of social network analysis. It examines how certain factors disrupt the processes of data collection and data analysis and endanger the validity and reliability of network-analytical studies. The discussion focuses on one-mode research designs that examine whole networks surveyed by questionnaire. The thesis begins with a brief account of the methodology of network analysis and then discusses the literature on the validity and reliability of social network analysis. It then discusses the influencing factors examined in this study and describes the methodology of the analysis.

Specifically, the thesis examines nonresponse and forgetting during the survey, attempts at sampling, the omission or removal of unimportant actors, symmetrization, dichotomization, and the collapsing of actors. These processes and methods are simulated by comparing the results of network-analytical analyses of an unchanged data set with analyses of variants of that data set in which the processes have been simulated. The analytical procedures tested are density, centralization (degree and closeness), the search for cliques and k-plexes, and centrality (degree, closeness, eigenvector, and betweenness). Density, centralization, and the extent of size reduction are regarded as the most important factors influencing reliability and validity. All combinations of size-reduction and transformation processes are simulated with a total of seven matrices, each with a different density and centralization. In some cases, different extents of size reduction and different problem-solving strategies are examined as well.

The study concludes that size-reduction and transformation processes can significantly change the results of network-analytical procedures. In most cases the error runs systematically in one direction, usually negative. Density and centralization are also shown to be influencing factors in many cases, although in different directions depending on the measure and method. Certain network-analytical procedures, such as closeness centrality and the search for subgroups, are especially sensitive. Certain size-reduction and transformation processes are more dangerous than others.
Subject (DDC)
300 Social Sciences, Sociology
Keywords
Critique of methods (Methodenkritik), Network Analysis, Social Network Analysis, Methodology, Network Sampling, Validity
Cite This
ISO 690
MARSCHALL, Nicolas, 2006. Methodological Problems with Transformation and Size Reduction of Data Sets in Network Analysis [Master thesis]
BibTex
@mastersthesis{Marschall2006Metho-4198,
  year={2006},
  title={Methodological Problems with Transformation and Size Reduction of Data Sets in Network Analysis},
  author={Marschall, Nicolas}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/4198">
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/4198/1/network_analysis.pdf"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T10:13:03Z</dc:date>
    <dc:language>eng</dc:language>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T10:13:03Z</dcterms:available>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/4198/1/network_analysis.pdf"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/42"/>
    <dcterms:issued>2006</dcterms:issued>
    <dcterms:title>Methodological Problems with Transformation and Size Reduction of Data Sets in Network Analysis</dcterms:title>
    <dc:rights>Attribution-NonCommercial-NoDerivs 2.0 Generic</dc:rights>
    <dc:contributor>Marschall, Nicolas</dc:contributor>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/4198"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:creator>Marschall, Nicolas</dc:creator>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/42"/>
    <dc:format>application/pdf</dc:format>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by-nc-nd/2.0/"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
  </rdf:Description>
</rdf:RDF>