Comparison of Image Generation Models for Abstract and Concrete Event Descriptions

dc.contributor.authorKhaliq, Mohammed
dc.contributor.authorFrassinelli, Diego
dc.contributor.authorSchulte Im Walde, Sabine
dc.date.accessioned2024-11-11T11:42:02Z
dc.date.available2024-11-11T11:42:02Z
dc.date.issued2024
dc.description.abstractWith the advent of diffusion-based image generation models such as DALL-E, Stable Diffusion and Midjourney, high-quality images can be easily generated using textual inputs. It is unclear, however, to what extent the generated images resemble human mental representations, especially regarding abstract event knowledge. We analyse the capability of four state-of-the-art models in generating images of verb-object event pairs when we systematically manipulate the degrees of abstractness of both the verbs and the object nouns. Human judgements assess the generated images and demonstrate that DALL-E is strongest for event pairs with concrete nouns (e.g., “pour water”; “believe person”), while Midjourney is preferred for event pairs with abstract nouns (e.g., “raise awareness”; “remain mystery”), irrespective of the concreteness of the verb. Across models, humans were most unsatisfied with images of event pairs that combined concrete verbs with abstract direct-object nouns (e.g., “speak truth”), and an additional ad-hoc annotation attributes this to their potential for figurative language.
dc.description.versionpublisheddeu
dc.identifier.doi10.18653/v1/2024.figlang-1.3
dc.identifier.urihttps://kops.uni-konstanz.de/handle/123456789/71192
dc.language.isoeng
dc.subject.ddc400
dc.titleComparison of Image Generation Models for Abstract and Concrete Event Descriptionseng
dc.typeINPROCEEDINGS
dspace.entity.typePublication
kops.citation.bibtex
@inproceedings{Khaliq2024Compa-71192,
  year={2024},
  doi={10.18653/v1/2024.figlang-1.3},
  title={Comparison of Image Generation Models for Abstract and Concrete Event Descriptions},
  isbn={979-8-89176-110-0},
  publisher={Association for Computational Linguistics},
  address={Kerrville, TX},
  booktitle={Proceedings of the 4th Workshop on Figurative Language Processing (FigLang 2024)},
  pages={15--21},
  editor={Ghosh, Debanjan and Muresan, Smaranda and Feldman, Anna},
  author={Khaliq, Mohammed and Frassinelli, Diego and Schulte Im Walde, Sabine}
}
kops.citation.iso690KHALIQ, Mohammed, Diego FRASSINELLI, Sabine SCHULTE IM WALDE, 2024. Comparison of Image Generation Models for Abstract and Concrete Event Descriptions. 4th Workshop on Figurative Language Processing (FigLang 2024). Mexico City, Mexico (Hybrid), 21. Juni 2024 - 21. Juni 2024. In: GHOSH, Debanjan, Hrsg., Smaranda MURESAN, Hrsg., Anna FELDMAN, Hrsg. und andere. Proceedings of the 4th Workshop on Figurative Language Processing (FigLang 2024). Kerrville, TX: Association for Computational Linguistics, 2024, S. 15-21. ISBN 979-8-89176-110-0. Verfügbar unter: doi: 10.18653/v1/2024.figlang-1.3deu
kops.citation.iso690KHALIQ, Mohammed, Diego FRASSINELLI, Sabine SCHULTE IM WALDE, 2024. Comparison of Image Generation Models for Abstract and Concrete Event Descriptions. 4th Workshop on Figurative Language Processing (FigLang 2024). Mexico City, Mexico (Hybrid), Jun 21, 2024 - Jun 21, 2024. In: GHOSH, Debanjan, ed., Smaranda MURESAN, ed., Anna FELDMAN, ed. and others. Proceedings of the 4th Workshop on Figurative Language Processing (FigLang 2024). Kerrville, TX: Association for Computational Linguistics, 2024, pp. 15-21. ISBN 979-8-89176-110-0. Available under: doi: 10.18653/v1/2024.figlang-1.3eng
kops.citation.rdf
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/71192">
    <dc:creator>Schulte Im Walde, Sabine</dc:creator>
    <dc:creator>Khaliq, Mohammed</dc:creator>
    <dc:contributor>Khaliq, Mohammed</dc:contributor>
    <dcterms:abstract>With the advent of diffusion-based image generation models such as DALL-E, Stable Diffusion and Midjourney, high-quality images can be easily generated using textual inputs. It is unclear, however, to what extent the generated images resemble human mental representations, especially regarding abstract event knowledge. We analyse the capability of four state-of-the-art models in generating images of verb-object event pairs when we systematically manipulate the degrees of abstractness of both the verbs and the object nouns. Human judgements assess the generated images and demonstrate that DALL-E is strongest for event pairs with concrete nouns (e.g., “pour water”; “believe person”), while Midjourney is preferred for event pairs with abstract nouns (e.g., “raise awareness”; “remain mystery”), irrespective of the concreteness of the verb. Across models, humans were most unsatisfied with images of event pairs that combined concrete verbs with abstract direct-object nouns (e.g., “speak truth”), and an additional ad-hoc annotation attributes this to their potential for figurative language.</dcterms:abstract>
    <dc:contributor>Frassinelli, Diego</dc:contributor>
    <dcterms:issued>2024</dcterms:issued>
    <dc:contributor>Schulte Im Walde, Sabine</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2024-11-11T11:42:02Z</dc:date>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/71192"/>
    <dcterms:title>Comparison of Image Generation Models for Abstract and Concrete Event Descriptions</dcterms:title>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2024-11-11T11:42:02Z</dcterms:available>
    <dc:creator>Frassinelli, Diego</dc:creator>
    <dc:language>eng</dc:language>
  </rdf:Description>
</rdf:RDF>
kops.conferencefield4th Workshop on Figurative Language Processing (FigLang 2024), 21. Juni 2024 - 21. Juni 2024, Mexico City, Mexico (Hybrid)deu
kops.date.conferenceEnd2024-06-21
kops.date.conferenceStart2024-06-21
kops.description.funding{"first":"dfg","second":"SCHU 2580/4-1"}
kops.flag.knbibliographytrue
kops.location.conferenceMexico City, Mexico (Hybrid)
kops.sourcefieldGHOSH, Debanjan, Hrsg., Smaranda MURESAN, Hrsg., Anna FELDMAN, Hrsg. und andere. <i>Proceedings of the 4th Workshop on Figurative Language Processing (FigLang 2024)</i>. Kerrville, TX: Association for Computational Linguistics, 2024, S. 15-21. ISBN 979-8-89176-110-0. Verfügbar unter: doi: 10.18653/v1/2024.figlang-1.3deu
kops.sourcefield.plainGHOSH, Debanjan, Hrsg., Smaranda MURESAN, Hrsg., Anna FELDMAN, Hrsg. und andere. Proceedings of the 4th Workshop on Figurative Language Processing (FigLang 2024). Kerrville, TX: Association for Computational Linguistics, 2024, S. 15-21. ISBN 979-8-89176-110-0. Verfügbar unter: doi: 10.18653/v1/2024.figlang-1.3deu
kops.sourcefield.plainGHOSH, Debanjan, ed., Smaranda MURESAN, ed., Anna FELDMAN, ed. and others. Proceedings of the 4th Workshop on Figurative Language Processing (FigLang 2024). Kerrville, TX: Association for Computational Linguistics, 2024, pp. 15-21. ISBN 979-8-89176-110-0. Available under: doi: 10.18653/v1/2024.figlang-1.3eng
kops.title.conference4th Workshop on Figurative Language Processing (FigLang 2024)
relation.isAuthorOfPublicationbf0689a7-23f2-460a-8abb-42ea30bb2d29
relation.isAuthorOfPublication.latestForDiscoverybf0689a7-23f2-460a-8abb-42ea30bb2d29
source.bibliographicInfo.fromPage15
source.bibliographicInfo.toPage21
source.contributor.editorGhosh, Debanjan
source.contributor.editorMuresan, Smaranda
source.contributor.editorFeldman, Anna
source.flag.etalEditortrue
source.identifier.isbn979-8-89176-110-0
source.publisherAssociation for Computational Linguistics
source.publisher.locationKerrville, TX
source.titleProceedings of the 4th Workshop on Figurative Language Processing (FigLang 2024)