Infoveillance of the Croatian Online Media During the COVID-19 Pandemic: One-Year Longitudinal Study Using Natural Language Processing
Online media play an important role in public health emergencies and serve as essential communication platforms. Infoveillance of online media during the COVID-19 pandemic is an important step toward gaining a better understanding of crisis communication. The goal of this study was to perform a long...
Published in: | JMIR public health and surveillance 2021-12, Vol.7 (12), p.e31540 |
---|---|
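Main Authors: | Beliga, Slobodan; Martinčić-Ipšić, Sanda; Matešić, Mihaela; Petrijevčanin Vuksanović, Irena; Meštrović, Ana |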
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c457t-38b342afa7f420e676854f10f71cd593bc98764815acc920a1b4b3fd1017a20c3 |
---|---|
cites | cdi_FETCH-LOGICAL-c457t-38b342afa7f420e676854f10f71cd593bc98764815acc920a1b4b3fd1017a20c3 |
container_end_page | |
container_issue | 12 |
container_start_page | e31540 |
container_title | JMIR public health and surveillance |
container_volume | 7 |
creator | Beliga, Slobodan Martinčić-Ipšić, Sanda Matešić, Mihaela Petrijevčanin Vuksanović, Irena Meštrović, Ana |
description | Online media play an important role in public health emergencies and serve as essential communication platforms. Infoveillance of online media during the COVID-19 pandemic is an important step toward gaining a better understanding of crisis communication.
The goal of this study was to perform a longitudinal analysis of the COVID-19-related content on online media based on natural language processing.
We collected a data set of news articles published by Croatian online media during the first 13 months of the pandemic. First, we tested the correlations between the number of articles and the number of new daily COVID-19 cases. Second, we analyzed the content by extracting the most frequent terms and applied the Jaccard similarity coefficient. Third, we compared the occurrence of the pandemic-related terms during the two waves of the pandemic. Finally, we applied named entity recognition to extract the most frequent entities and tracked the dynamics of changes during the observation period.
The results showed no significant correlation between the number of articles and the number of new daily COVID-19 cases. Furthermore, there were high overlaps in the terminology used in all articles published during the pandemic with a slight shift in the pandemic-related terms between the first and the second waves. Finally, the findings indicate that the most influential entities have lower overlaps for the identified people and higher overlaps for locations and institutions.
Our study shows that online media have a prompt response to the pandemic with a large number of COVID-19-related articles. There was a high overlap in the frequently used terms across the first 13 months, which may indicate the narrow focus of reporting in certain periods. However, the pandemic-related terminology is well-covered. |
doi_str_mv | 10.2196/31540 |
format | article |
fullrecord | <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_eec3a7866a894b2fac6dfb0c1bda3d30</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_eec3a7866a894b2fac6dfb0c1bda3d30</doaj_id><sourcerecordid>2594291560</sourcerecordid><originalsourceid>FETCH-LOGICAL-c457t-38b342afa7f420e676854f10f71cd593bc98764815acc920a1b4b3fd1017a20c3</originalsourceid><addsrcrecordid>eNpdkl1rFDEUhgdRbKn7FyQggjej-ZpM4oUgWz8WVregFbwKZ_IxzTKbtJmZQv-Av9tst5bWqxzOefKec5K3qhYEv6VEiXeMNBw_qY4pE6qmSuCnD-KjajGOW4wxEZIxqZ5XR4y3TDEpj6s_q-jTtQvDANE4lDyaLhxa5gRTgIg2cQjRoW_OBkCncw6xPwCbX6vTmih0BtG6XTDvC-rq3w4yWqfYh2m2IcKAfpTgBp2P-4vfYZpzya0h9jP0Dp3lZNy4r72onnkYRre4O0-q88-ffi6_1uvNl9Xy47o2vGmnmsmOcQoeWs8pdqIVsuGeYN8SYxvFOqNkK7gkDRijKAbS8Y55SzBpgWLDTqrVQdcm2OrLHHaQb3SCoG8TKfca8hTM4LRzhkErhQCpeEc9GGF9hw3pLDDLcNH6cNC6nLuds8bFqWz3SPRxJYYL3adrLVvSKMmLwJs7gZyuZjdOehdG4_Zf4dI8atooThVpxL7Xq__QbZpzeeBCCdI0VJS9C_X6QJmcxjE7fz8MwXpvFH1rlMK9fDj5PfXPFuwvNn63kA</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2615526876</pqid></control><display><type>article</type><title>Infoveillance of the Croatian Online Media During the COVID-19 Pandemic: One-Year Longitudinal Study Using Natural Language Processing</title><source>Publicly Available Content (ProQuest)</source><source>PubMed Central</source><source>Coronavirus Research Database</source><creator>Beliga, Slobodan ; Martinčić-Ipšić, Sanda ; Matešić, Mihaela ; Petrijevčanin Vuksanović, Irena ; Meštrović, Ana</creator><creatorcontrib>Beliga, Slobodan ; Martinčić-Ipšić, Sanda ; Matešić, Mihaela ; Petrijevčanin Vuksanović, Irena ; Meštrović, Ana</creatorcontrib><description>Online media play an important role in public health emergencies and serve as essential communication platforms. 
Infoveillance of online media during the COVID-19 pandemic is an important step toward gaining a better understanding of crisis communication.
The goal of this study was to perform a longitudinal analysis of the COVID-19-related content on online media based on natural language processing.
We collected a data set of news articles published by Croatian online media during the first 13 months of the pandemic. First, we tested the correlations between the number of articles and the number of new daily COVID-19 cases. Second, we analyzed the content by extracting the most frequent terms and applied the Jaccard similarity coefficient. Third, we compared the occurrence of the pandemic-related terms during the two waves of the pandemic. Finally, we applied named entity recognition to extract the most frequent entities and tracked the dynamics of changes during the observation period.
The results showed no significant correlation between the number of articles and the number of new daily COVID-19 cases. Furthermore, there were high overlaps in the terminology used in all articles published during the pandemic with a slight shift in the pandemic-related terms between the first and the second waves. Finally, the findings indicate that the most influential entities have lower overlaps for the identified people and higher overlaps for locations and institutions.
Our study shows that online media have a prompt response to the pandemic with a large number of COVID-19-related articles. There was a high overlap in the frequently used terms across the first 13 months, which may indicate the narrow focus of reporting in certain periods. However, the pandemic-related terminology is well-covered.</description><identifier>ISSN: 2369-2960</identifier><identifier>EISSN: 2369-2960</identifier><identifier>DOI: 10.2196/31540</identifier><identifier>PMID: 34739388</identifier><language>eng</language><publisher>Canada: JMIR Publications</publisher><subject>Communication ; Coronaviruses ; COVID-19 ; Datasets ; Disease transmission ; Epidemics ; Health surveillance ; Humans ; Influenza ; Infodemiology ; Information sources ; Longitudinal Studies ; Media coverage ; Medical research ; Multimedia ; Natural Language Processing ; News media ; Original Paper ; Pandemics ; Public health ; SARS-CoV-2 ; Sentiment analysis ; Severe acute respiratory syndrome coronavirus 2 ; Social Media ; Statistical methods ; Terminology</subject><ispartof>JMIR public health and surveillance, 2021-12, Vol.7 (12), p.e31540</ispartof><rights>Slobodan Beliga, Sanda Martinčić-Ipšić, Mihaela Matešić, Irena Petrijevčanin Vuksanović, Ana Meštrović. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 24.12.2021.</rights><rights>2021. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>Slobodan Beliga, Sanda Martinčić-Ipšić, Mihaela Matešić, Irena Petrijevčanin Vuksanović, Ana Meštrović. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 24.12.2021. 
2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c457t-38b342afa7f420e676854f10f71cd593bc98764815acc920a1b4b3fd1017a20c3</citedby><cites>FETCH-LOGICAL-c457t-38b342afa7f420e676854f10f71cd593bc98764815acc920a1b4b3fd1017a20c3</cites><orcidid>0000-0002-1900-5333 ; 0000-0001-9513-9467 ; 0000-0003-1407-6156 ; 0000-0002-3793-6852 ; 0000-0002-4780-8512</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/2615526876?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2615526876?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,25731,27901,27902,36989,36990,38493,43871,44566,53766,53768,74155,74869</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/34739388$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Beliga, Slobodan</creatorcontrib><creatorcontrib>Martinčić-Ipšić, Sanda</creatorcontrib><creatorcontrib>Matešić, Mihaela</creatorcontrib><creatorcontrib>Petrijevčanin Vuksanović, Irena</creatorcontrib><creatorcontrib>Meštrović, Ana</creatorcontrib><title>Infoveillance of the Croatian Online Media During the COVID-19 Pandemic: One-Year Longitudinal Study Using Natural Language Processing</title><title>JMIR public health and surveillance</title><addtitle>JMIR Public Health Surveill</addtitle><description>Online media play an important role in public health emergencies and serve as essential communication platforms. Infoveillance of online media during the COVID-19 pandemic is an important step toward gaining a better understanding of crisis communication.
The goal of this study was to perform a longitudinal analysis of the COVID-19-related content on online media based on natural language processing.
We collected a data set of news articles published by Croatian online media during the first 13 months of the pandemic. First, we tested the correlations between the number of articles and the number of new daily COVID-19 cases. Second, we analyzed the content by extracting the most frequent terms and applied the Jaccard similarity coefficient. Third, we compared the occurrence of the pandemic-related terms during the two waves of the pandemic. Finally, we applied named entity recognition to extract the most frequent entities and tracked the dynamics of changes during the observation period.
The results showed no significant correlation between the number of articles and the number of new daily COVID-19 cases. Furthermore, there were high overlaps in the terminology used in all articles published during the pandemic with a slight shift in the pandemic-related terms between the first and the second waves. Finally, the findings indicate that the most influential entities have lower overlaps for the identified people and higher overlaps for locations and institutions.
Our study shows that online media have a prompt response to the pandemic with a large number of COVID-19-related articles. There was a high overlap in the frequently used terms across the first 13 months, which may indicate the narrow focus of reporting in certain periods. However, the pandemic-related terminology is well-covered.</description><subject>Communication</subject><subject>Coronaviruses</subject><subject>COVID-19</subject><subject>Datasets</subject><subject>Disease transmission</subject><subject>Epidemics</subject><subject>Health surveillance</subject><subject>Humans</subject><subject>Influenza</subject><subject>Infodemiology</subject><subject>Information sources</subject><subject>Longitudinal Studies</subject><subject>Media coverage</subject><subject>Medical research</subject><subject>Multimedia</subject><subject>Natural Language Processing</subject><subject>News media</subject><subject>Original Paper</subject><subject>Pandemics</subject><subject>Public health</subject><subject>SARS-CoV-2</subject><subject>Sentiment analysis</subject><subject>Severe acute respiratory syndrome coronavirus 2</subject><subject>Social Media</subject><subject>Statistical 
methods</subject><subject>Terminology</subject><issn>2369-2960</issn><issn>2369-2960</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>COVID</sourceid><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNpdkl1rFDEUhgdRbKn7FyQggjej-ZpM4oUgWz8WVregFbwKZ_IxzTKbtJmZQv-Av9tst5bWqxzOefKec5K3qhYEv6VEiXeMNBw_qY4pE6qmSuCnD-KjajGOW4wxEZIxqZ5XR4y3TDEpj6s_q-jTtQvDANE4lDyaLhxa5gRTgIg2cQjRoW_OBkCncw6xPwCbX6vTmih0BtG6XTDvC-rq3w4yWqfYh2m2IcKAfpTgBp2P-4vfYZpzya0h9jP0Dp3lZNy4r72onnkYRre4O0-q88-ffi6_1uvNl9Xy47o2vGmnmsmOcQoeWs8pdqIVsuGeYN8SYxvFOqNkK7gkDRijKAbS8Y55SzBpgWLDTqrVQdcm2OrLHHaQb3SCoG8TKfca8hTM4LRzhkErhQCpeEc9GGF9hw3pLDDLcNH6cNC6nLuds8bFqWz3SPRxJYYL3adrLVvSKMmLwJs7gZyuZjdOehdG4_Zf4dI8atooThVpxL7Xq__QbZpzeeBCCdI0VJS9C_X6QJmcxjE7fz8MwXpvFH1rlMK9fDj5PfXPFuwvNn63kA</recordid><startdate>20211224</startdate><enddate>20211224</enddate><creator>Beliga, Slobodan</creator><creator>Martinčić-Ipšić, Sanda</creator><creator>Matešić, Mihaela</creator><creator>Petrijevčanin Vuksanović, Irena</creator><creator>Meštrović, Ana</creator><general>JMIR 
Publications</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7RV</scope><scope>7X7</scope><scope>7XB</scope><scope>8C1</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>CCPQU</scope><scope>COVID</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>K9.</scope><scope>KB0</scope><scope>M0S</scope><scope>NAPCQ</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-1900-5333</orcidid><orcidid>https://orcid.org/0000-0001-9513-9467</orcidid><orcidid>https://orcid.org/0000-0003-1407-6156</orcidid><orcidid>https://orcid.org/0000-0002-3793-6852</orcidid><orcidid>https://orcid.org/0000-0002-4780-8512</orcidid></search><sort><creationdate>20211224</creationdate><title>Infoveillance of the Croatian Online Media During the COVID-19 Pandemic: One-Year Longitudinal Study Using Natural Language Processing</title><author>Beliga, Slobodan ; Martinčić-Ipšić, Sanda ; Matešić, Mihaela ; Petrijevčanin Vuksanović, Irena ; Meštrović, Ana</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c457t-38b342afa7f420e676854f10f71cd593bc98764815acc920a1b4b3fd1017a20c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Communication</topic><topic>Coronaviruses</topic><topic>COVID-19</topic><topic>Datasets</topic><topic>Disease transmission</topic><topic>Epidemics</topic><topic>Health surveillance</topic><topic>Humans</topic><topic>Influenza</topic><topic>Infodemiology</topic><topic>Information sources</topic><topic>Longitudinal Studies</topic><topic>Media 
coverage</topic><topic>Medical research</topic><topic>Multimedia</topic><topic>Natural Language Processing</topic><topic>News media</topic><topic>Original Paper</topic><topic>Pandemics</topic><topic>Public health</topic><topic>SARS-CoV-2</topic><topic>Sentiment analysis</topic><topic>Severe acute respiratory syndrome coronavirus 2</topic><topic>Social Media</topic><topic>Statistical methods</topic><topic>Terminology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Beliga, Slobodan</creatorcontrib><creatorcontrib>Martinčić-Ipšić, Sanda</creatorcontrib><creatorcontrib>Matešić, Mihaela</creatorcontrib><creatorcontrib>Petrijevčanin Vuksanović, Irena</creatorcontrib><creatorcontrib>Meštrović, Ana</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>ProQuest Nursing and Allied Health Journals</collection><collection>ProQuest Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Public Health Database (Proquest)</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>ProQuest One Community College</collection><collection>Coronavirus Research Database</collection><collection>ProQuest Central</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection 
(Alumni)</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Nursing & Allied Health Database (Alumni Edition)</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Nursing & Allied Health Premium</collection><collection>Publicly Available Content (ProQuest)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>JMIR public health and surveillance</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Beliga, Slobodan</au><au>Martinčić-Ipšić, Sanda</au><au>Matešić, Mihaela</au><au>Petrijevčanin Vuksanović, Irena</au><au>Meštrović, Ana</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Infoveillance of the Croatian Online Media During the COVID-19 Pandemic: One-Year Longitudinal Study Using Natural Language Processing</atitle><jtitle>JMIR public health and surveillance</jtitle><addtitle>JMIR Public Health Surveill</addtitle><date>2021-12-24</date><risdate>2021</risdate><volume>7</volume><issue>12</issue><spage>e31540</spage><pages>e31540-</pages><issn>2369-2960</issn><eissn>2369-2960</eissn><abstract>Online media play an important role in public health emergencies and serve as essential communication platforms. Infoveillance of online media during the COVID-19 pandemic is an important step toward gaining a better understanding of crisis communication.
The goal of this study was to perform a longitudinal analysis of the COVID-19-related content on online media based on natural language processing.
We collected a data set of news articles published by Croatian online media during the first 13 months of the pandemic. First, we tested the correlations between the number of articles and the number of new daily COVID-19 cases. Second, we analyzed the content by extracting the most frequent terms and applied the Jaccard similarity coefficient. Third, we compared the occurrence of the pandemic-related terms during the two waves of the pandemic. Finally, we applied named entity recognition to extract the most frequent entities and tracked the dynamics of changes during the observation period.
The results showed no significant correlation between the number of articles and the number of new daily COVID-19 cases. Furthermore, there were high overlaps in the terminology used in all articles published during the pandemic with a slight shift in the pandemic-related terms between the first and the second waves. Finally, the findings indicate that the most influential entities have lower overlaps for the identified people and higher overlaps for locations and institutions.
Our study shows that online media have a prompt response to the pandemic with a large number of COVID-19-related articles. There was a high overlap in the frequently used terms across the first 13 months, which may indicate the narrow focus of reporting in certain periods. However, the pandemic-related terminology is well-covered.</abstract><cop>Canada</cop><pub>JMIR Publications</pub><pmid>34739388</pmid><doi>10.2196/31540</doi><orcidid>https://orcid.org/0000-0002-1900-5333</orcidid><orcidid>https://orcid.org/0000-0001-9513-9467</orcidid><orcidid>https://orcid.org/0000-0003-1407-6156</orcidid><orcidid>https://orcid.org/0000-0002-3793-6852</orcidid><orcidid>https://orcid.org/0000-0002-4780-8512</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2369-2960 |
ispartof | JMIR public health and surveillance, 2021-12, Vol.7 (12), p.e31540 |
issn | 2369-2960 2369-2960 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_eec3a7866a894b2fac6dfb0c1bda3d30 |
source | Publicly Available Content (ProQuest); PubMed Central; Coronavirus Research Database |
subjects | Communication Coronaviruses COVID-19 Datasets Disease transmission Epidemics Health surveillance Humans Influenza Infodemiology Information sources Longitudinal Studies Media coverage Medical research Multimedia Natural Language Processing News media Original Paper Pandemics Public health SARS-CoV-2 Sentiment analysis Severe acute respiratory syndrome coronavirus 2 Social Media Statistical methods Terminology |
title | Infoveillance of the Croatian Online Media During the COVID-19 Pandemic: One-Year Longitudinal Study Using Natural Language Processing |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T16%3A37%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Infoveillance%20of%20the%20Croatian%20Online%20Media%20During%20the%20COVID-19%20Pandemic:%20One-Year%20Longitudinal%20Study%20Using%20Natural%20Language%20Processing&rft.jtitle=JMIR%20public%20health%20and%20surveillance&rft.au=Beliga,%20Slobodan&rft.date=2021-12-24&rft.volume=7&rft.issue=12&rft.spage=e31540&rft.pages=e31540-&rft.issn=2369-2960&rft.eissn=2369-2960&rft_id=info:doi/10.2196/31540&rft_dat=%3Cproquest_doaj_%3E2594291560%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c457t-38b342afa7f420e676854f10f71cd593bc98764815acc920a1b4b3fd1017a20c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2615526876&rft_id=info:pmid/34739388&rfr_iscdi=true |
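
The description in this record says the authors extracted the most frequent terms and applied the Jaccard similarity coefficient to measure overlap between periods. A minimal sketch of that idea follows; the helper names `top_terms` and `jaccard`, the cutoff `k`, and the toy token lists are illustrative assumptions and are not taken from the article, whose actual preprocessing and term-selection steps are not given in this record.

```python
from collections import Counter


def top_terms(tokens, k):
    """Return the set of the k most frequent tokens (hypothetical helper)."""
    return {term for term, _ in Counter(tokens).most_common(k)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; taken as 1.0 when both sets are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Toy, made-up token streams standing in for two periods of news text.
period_1 = "virus virus virus mask mask lockdown".split()
period_2 = "vaccine vaccine vaccine virus virus mask".split()

terms_1 = top_terms(period_1, k=3)  # {"virus", "mask", "lockdown"}
terms_2 = top_terms(period_2, k=3)  # {"vaccine", "virus", "mask"}

# Two shared terms out of four distinct terms overall.
overlap = jaccard(terms_1, terms_2)
print(f"Jaccard overlap of top terms: {overlap:.2f}")
```

A value near 1 would correspond to the "high overlap" in terminology the abstract reports, and a value near 0 to largely disjoint vocabularies; the thresholds the authors considered high or low are not stated in this record.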