W. London, J. Yorke (1973)
Recurrent outbreaks of measles, chickenpox and mumps. I. Seasonal variation in contact rates. American Journal of Epidemiology, 98(6)
K. Ma, W. Schaffner, C. Colmenares, J. Howser, J. Jones, K. Poehling (2006)
Influenza Vaccinations of Young Children Increased With Media Coverage in 2003. Pediatrics, 117
P. Geoffard, T. Philipson (1996)
Rational Epidemics and Their Public Control. International Economic Review, 37
J. Bayham, N. Kuminoff, Q. Gunn, Eli Fenichel (2015)
Measured voluntary avoidance behaviour during the 2009 A/H1N1 epidemic. Proceedings of the Royal Society B: Biological Sciences, 282
M. Keeling, P. Rohani (2007)
Modeling Infectious Diseases in Humans and Animals
M. Majumder, Sheryl Kluberg, M. Santillana, S. Mekaru, J. Brownstein (2015)
2014 Ebola Outbreak: Media Events Track Changes in Observed Reproductive Number. PLoS Currents, 7
J. Epstein, Jon Parker, D. Cummings, Ross Hammond (2007)
Coupled Contagion Dynamics of Fear and Disease: Mathematical and Computational Explorations. PLoS ONE, 3
Vasileios Lampos, N. Cristianini (2010)
Tracking the flu pandemic by monitoring the social web. 2010 2nd International Workshop on Cognitive Information Processing
J. Tchuenche, Nothabo Dube, C. Bhunu, Robert Smith, C. Bauch (2011)
The impact of media coverage on the transmission dynamics of human influenza. BMC Public Health, 11
Yanni Xiao, Sanyi Tang, Jianhong Wu (2015)
Media impact switching surface during an infectious disease outbreak. Scientific Reports, 5
M. Frank, Lewis Mitchell, P. Dodds, C. Danforth (2013)
Happiness and the Patterns of Life: A Study of Geolocated Tweets. Scientific Reports, 3
J. Cui, Yonghong Sun, Huaiping Zhu (2007)
The Impact of Media on the Control of Infectious Diseases. Journal of Dynamics and Differential Equations, 20
David Broniatowski, Michael Paul, Mark Dredze (2013)
National and Local Influenza Surveillance through Twitter: An Analysis of the 2012-2013 Influenza Epidemic. PLoS ONE, 8
T. Philipson (1996)
Private Vaccination and Public Health: An Empirical Examination for U.S. Measles. Journal of Human Resources, 31
Liangzhe Chen, K. Hossain, P. Butler, Naren Ramakrishnan, B. Prakash (2014)
Flu Gone Viral: Syndromic Surveillance of Flu on Twitter Using Temporal Topic Models. 2014 IEEE International Conference on Data Mining
G. Suter (2009)
Model based inference in the life sciences: A primer on evidence, by David R. Anderson. Integrated Environmental Assessment and Management, 5
D. Bell (2004)
Public Health Interventions and SARS Spread, 2003. Emerging Infectious Diseases, 10
Byung-Kwang Yoo, Margaret Holland, J. Bhattacharya, C. Phelps, P. Szilagyi (2010)
Effects of mass media coverage on timing and annual receipt of influenza vaccination among Medicare elderly. Health Services Research, 45(5 Pt 1)
Shannon Collinson, K. Khan, J. Heffernan (2015)
The Effects of Media Reports on Disease Spread and Important Public Health Measurements. PLoS ONE, 10
S. Funk, V. Jansen (2013)
The Talk of the Town: Modelling the Spread of Information and Changes in Behaviour
Harshavardhan Achrekar, Avinash Gandhe, Ross Lazarus, Ssu-Hsin Yu, Benyuan Liu (2011)
Predicting Flu Trends using Twitter data. 2011 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)
M. Kremer (1996)
Integrating Behavioral Choice into Epidemiological Models of AIDS. Quarterly Journal of Economics, 111
S. Vinogradov, V. Danilova, A. Troussov, Sergey Maruev (2017)
The Demographics of Social Media Users in the Russian-Language Internet
C. Zhu, R. Byrd, P. Lu, J. Nocedal (1997)
Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans. Math. Softw., 23
B. Althouse, S. Scarpino, L. Meyers, J. Ayers, Marisa Bargsten, J. Baumbach, J. Brownstein, Lauren Castro, H. Clapham, D. Cummings, S. Valle, S. Eubank, Geoffrey Fairchild, L. Finelli, N. Generous, Dylan George, David Harper, Laurent Hébert-Dufresne, M. Johansson, K. Konty, M. Lipsitch, G. Milinovich, Joseph Miller, E. Nsoesie, D. Olson, Michael Paul, P. Polgreen, R. Priedhorsky, J. Read, I. Rodríguez-Barraquer, Derek Smith, C. Stefansen, D. Swerdlow, Deborah Thompson, Alessandro Vespignani, A. Wesolowski (2015)
Enhancing disease surveillance with novel data streams: challenges and opportunities. EPJ Data Science, 4
Sharon Alajajian, J. Williams, A. Reagan, Stephen Alajajian, M. Frank, Lewis Mitchell, Jacob Lahne, C. Danforth, P. Dodds (2015)
The Lexicocalorimeter: Gauging public health through caloric input and output on social media. PLoS ONE, 12
M. Domenico, A. Lima, Marta González, A. Arenas (2015)
Personalized routing for multitudes in smart cities. EPJ Data Science, 4
S. Funk, M. Salathé, V. Jansen (2010)
Modelling the influence of human behaviour on the spread of infectious diseases: a review. Journal of The Royal Society Interface, 7
J. Shaman, A. Karspeck (2012)
Forecasting seasonal outbreaks of influenza. Proceedings of the National Academy of Sciences, 109
Dongmei Xiao, Shigui Ruan (2006)
Global analysis of an epidemic model with nonmonotone incidence rate. Mathematical Biosciences, 208
Jeremy Ginsberg, Matthew Mohebbi, Rajan Patel, Lynnette Brammer, Mark Smolinski, Larry Brilliant (2009)
Detecting influenza epidemics using search engine query data. Nature, 457
Shannon Collinson, J. Heffernan (2014)
Modelling the effects of media during an influenza epidemic. BMC Public Health, 14
A. d’Onofrio, P. Manfredi, E. Salinelli (2007)
Vaccinating behaviour, information, and the dynamics of SIR vaccine preventable diseases. Theoretical Population Biology, 71(3)
(2016)
Data from: A data-driven model for influenza transmission incorporating media effects
S. Codish, L. Novack, J. Dreiher, L. Barski, A. Jotkowitz, L. Zeller, V. Novack (2014)
Impact of Mass Media on Public Behavior and Physicians: An Ecological Study of the H1N1 Influenza Pandemic. Infection Control & Hospital Epidemiology, 35
D. Greenhalgh, Sourav Rana, Sudip Samanta, Tridip Sardar, S. Bhattacharya, J. Chattopadhyay (2015)
Awareness programs control infectious disease - Multiple delay induced mathematical model. Appl. Math. Comput., 251
Qian Zhang, C. Gioannini, D. Paolotti, N. Perra, Daniela Perrotta, M. Quaggiotto, M. Tizzoni, Alessandro Vespignani (2015)
Social Data Mining and Seasonal Influenza Forecasts: The FluOutlook Platform
I. Kiss, J. Cassell, M. Recker, P. Simon (2010)
The impact of information transmission on epidemic outbreaks. Mathematical Biosciences, 225(1)
D. Lazer, Ryan Kennedy, Gary King, A. Vespignani (2014)
The Parable of Google Flu: Traps in Big Data Analysis. Science, 343
J. Cui, X. Tao, Huaiping Zhu (2008)
An SIS Infection Model Incorporating Media Coverage. Rocky Mountain Journal of Mathematics, 38
CDC Influenza (flu) reports (online). Accessed: 2016-02-25
H. Wearing, P. Rohani, M. Keeling (2005)
Appropriate Models for the Management of Infectious Diseases. PLoS Medicine, 2
Alex Lamb, Michael Paul, Mark Dredze (2013)
Separating Fact from Fear: Tracking Flu Infections on Twitter
K. Nicholson, M. Wiselka (1989)
Infectious Diseases: A Review. Journal of the Royal College of Physicians of London, 23
Lewis Mitchell, K. Harris, M. Frank, P. Dodds, C. Danforth (2013)
The Geography of Happiness: Connecting Twitter Sentiment and Expression, Demographics, and Objective Characteristics of Place. PLoS ONE, 8
J. Yorke, W. London (1973)
Recurrent outbreaks of measles, chickenpox and mumps. II. Systematic differences in contact rates and stochastic effects. American Journal of Epidemiology, 98(6)
A. Culotta (2010)
Towards detecting influenza epidemics by analyzing Twitter messages
Alessio Signorini, Alberto Segre, P. Polgreen (2011)
The Use of Twitter to Track Levels of Disease Activity and Public Concern in the U.S. during the Influenza A H1N1 Pandemic. PLoS ONE, 6
A data-driven model for influenza transmission incorporating media effects

Lewis Mitchell and Joshua V. Ross

School of Mathematical Sciences, University of Adelaide, North Terrace, 5005 Adelaide, Australia

LM, 0000-0001-8191-1997

Research. rsos.royalsocietypublishing.org

Cite this article: Mitchell L, Ross JV. 2016 A data-driven model for influenza transmission incorporating media effects. R. Soc. open sci. 3: 160481. http://dx.doi.org/10.1098/rsos.160481

Received: 23 August 2016
Accepted: 22 September 2016

Subject Category: Mathematics
Subject Areas: applied mathematics/health and disease and epidemiology/mathematical modelling
Keywords: epidemiology, influenza, mathematical modelling, social media, Twitter

Author for correspondence: Lewis Mitchell, e-mail: lewis.mitchell@adelaide.edu.au

Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3512445.

2016 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

Numerous studies have attempted to model the effect of mass media on the transmission of diseases such as influenza; however, quantitative data on media engagement has until recently been difficult to obtain. With the recent explosion of ‘big data’ coming from online social media and the like, large volumes of data on a population’s engagement with mass media during an epidemic are becoming available to researchers. In this study, we combine an online dataset comprising millions of shared messages relating to influenza with traditional surveillance data on flu activity to suggest a functional form for the relationship between the two. Using this data, we present a simple deterministic model for influenza dynamics incorporating media effects, and show that such a model helps explain the dynamics of historical influenza outbreaks. Furthermore, through model selection we show that the proposed media function fits historical data better than other media functions proposed in earlier studies.

1. Introduction

Traditional models of epidemics assume static parameter values over the course of an outbreak [1]. As such, they do not allow for changes in human behaviour which in turn are likely to impact the rate of transmission in a population. Such behavioural changes in response to disease outbreaks are well established [2]. This includes self-imposed social distancing during influenza pandemics [3], and the usage of face masks and changes in travel behaviour during the severe acute respiratory syndrome (SARS) outbreak of 2002–2004 [4]. The term prevalence elastic behaviour has arisen to explain voluntary protective behaviour which increases with disease prevalence [5], as has been observed for both measles [6] and HIV [7].

Close to real-time awareness of disease prevalence in an outbreak is now common owing to the relatively recent explosion in mass and social media. The past decade has seen significant growth in studies concerning the interaction of media, human behaviour and infectious disease dynamics, and there now exists a substantial body of work on this topic [2,8–14]. Despite this growth, empirical studies of prevalence elastic behaviour due to mass media have until recently been difficult due to the lack of availability of data directly measuring media engagement and relating it to behavioural change. As such, the vast majority of studies in this area can be broadly classified into two groups, with slightly different motivations.
First, there are pure mathematical models of behavioural change, in which a model is formulated that accounts for how dynamics are influenced by disease awareness or prevalence, typically facilitated through media. These models usually introduce new states which account for the behavioural status of individuals [15], allow modification to the contact structure [3,16], or allow modification to the model parameters [17,18]; the consequences are then explored. Collinson et al. [13] model behavioural change due to media by explicitly including a compartment for individuals influenced by mass media in an SEIR-type model, also incorporating effects like vaccination and social distancing. This study is of particular interest because it incorporates a ‘media fatigue’ effect during the 2009/2010 H1N1 pandemic by fitting to news report data collected from newspaper homepages during the pandemic.

Second, pure statistical models of media and prevalence are applied to large datasets to produce statistical regression models relating some measure of the volume of media concerning epidemics to the prevalence of infection [9,10] or reproductive number [19]. Such models have recently become popular due to the rapid increase in new data streams coming from Internet and online social media usage [20,21]. The study of Signorini et al. [11] is an exception to this trend: while it is a pure statistical model, it includes an investigation of the relationship between ‘tweets’ on Twitter and public sentiment with respect to H1N1. The FluOutlook platform [14] is also particularly interesting; by using a variety of data sources, including Twitter, to initialize a global agent-based epidemiological model it is able to produce real-time forecasts of an evolving influenza season.

Here our focus is on simple models for incorporating behavioural changes from awareness of disease prevalence, through modification to the effective transmission rate parameter.
We measure disease dynamics through influenza incidence data from the United States over the period 1998/1999–2014/2015, and human behaviour through social media data collected from Twitter over the period 2009/2010–2014/2015. Modification to the effective transmission rate is via a so-called media function. Three distinct media functions have been introduced, and recently compared, in the literature [22].

A potential criticism of pure mathematical model-based studies, as described above, is that the usefulness of the model when analysing real data is uncertain. In fact, as we will show here, some of these models have only very limited use in describing data coming from historical influenza outbreaks. On the other hand, while pure statistical models of media and prevalence are potentially of use for detecting and tracking disease incidence, they are subject to typical criticisms of ‘big data’ analyses [23] as containing biases and tending towards overfitting. As such, their usefulness for understanding potential mechanisms of impacts, as is the focus of model-based analyses, is limited.

We propose a data-driven approach that couples these existing paradigms: through a statistical analysis of data on media engagement and disease prevalence we develop a mathematical model of behaviour change which may then be validated against data. Our approach uses online social media data from Twitter alongside surveillance data on influenza to inform the form of the media function. The motivation is that by using both sources of data, we have some empirical justification for the form of the chosen media function and can also better describe real observations. By using model selection criteria, we show that the media function proposed here fits historical surveillance data better than other media functions proposed in earlier studies.
The structure of the remainder of this paper is as follows: in §2, we describe the dataset and model used; in §3, we show results comparing our proposed model with surveillance data, and then conclude with a discussion in §4.

2. Material and methods

In order to measure media engagement, we use a corpus of over 2.9 million geolocated, flu-related tweets collected from the contiguous United States between September 2009 and July 2015. This sample was provided by the Computational Story Lab at the University of Vermont, and is a subset of Twitter’s ‘garden hose’ feed, representing approximately 10% of all public messages posted to the platform. In this study, we consider only tweets which contain one or more of the strings ‘flu’, ‘#flu’, ‘influenza’ or ‘#influenza’. Furthermore, we will focus on ‘retweeted’ messages, where an individual has opted to reshare a tweet originally authored by someone else with their own followers by means of a retweet button within the Twitter interface or by appending the string ‘RT’ to the beginning of the original message. Such messages account for approximately 30% of the corpus and are mainly resharings of flu-related articles from major news outlets, but can also contain retweets of messages authored by regular Twitter users.
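The selection rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: case-insensitive substring matching, a hypothetical boolean `retweeted_flag` standing in for whatever metadata marks button retweets, and a hypothetical `(text, retweeted_flag)` record layout. None of these details is specified in the text. Plain substring matching also catches unrelated words such as ‘fluent’, so a production filter would likely need to be stricter.

```python
import re

# Strings named in the text. With substring matching, 'flu' alone already
# covers '#flu', 'influenza' and '#influenza'; all four are listed for clarity.
FLU_STRINGS = ("flu", "#flu", "influenza", "#influenza")


def is_flu_related(text):
    """True if the tweet text contains any flu-related string (case-insensitive: an assumption)."""
    lowered = text.lower()
    return any(s in lowered for s in FLU_STRINGS)


def is_retweet(text, retweeted_flag=False):
    """True for button retweets (hypothetical metadata flag) or a manual 'RT' prefix."""
    return retweeted_flag or re.match(r"\s*RT\b", text) is not None


def flu_retweet_fraction(tweets):
    """Fraction of all tweets that are flu-related retweets.

    `tweets` is an iterable of (text, retweeted_flag) pairs (hypothetical layout).
    """
    tweets = list(tweets)
    if not tweets:
        return 0.0
    hits = sum(
        1 for text, flag in tweets if is_flu_related(text) and is_retweet(text, flag)
    )
    return hits / len(tweets)
```

Dividing by the total number of tweets, rather than by the number of flu-related tweets, mirrors the proportion plotted against ILI activity in figure 1.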
We use a deterministic SEEIIR-M model (susceptible–exposed–infected–recovered with media, with two compartments for exposed and infected individuals) to model the transmission of influenza under the influence of media effects:

$$\frac{dS}{dt} = -\beta f(I) S I, \tag{2.1}$$
$$\frac{dE_1}{dt} = \beta f(I) S I - 2\sigma E_1, \tag{2.2}$$
$$\frac{dE_2}{dt} = 2\sigma E_1 - 2\sigma E_2, \tag{2.3}$$
$$\frac{dI_1}{dt} = 2\sigma E_2 - 2\gamma I_1, \tag{2.4}$$
$$\frac{dI_2}{dt} = 2\gamma I_1 - 2\gamma I_2 \tag{2.5}$$
and
$$\frac{dR}{dt} = 2\gamma I_2, \tag{2.6}$$

where $S$, $E_1$, $E_2$, $I_1$, $I_2$ and $R$ represent the proportions of the population in each compartment, $S + E_1 + E_2 + I_1 + I_2 + R = 1$, $\beta$ represents the effective transmission rate in the absence of media effects, $1/\sigma$ represents the average latent period, $1/\gamma$ represents the average infectious period and $f(I)$ is the so-called media function which represents the reduction in transmission of the disease through the influence of mass media. Consequently, $0 \le f(I) \le 1$ with $f(I) \equiv 1$ implying no effect of media upon transmission, and we will assume $f(I)$ is monotonically decreasing in $I$. Setting $f(I) \equiv 1$ recovers the standard SEEIIR model. As $f(0) = 1$ for each media function, the basic reproduction number is $R_0 = \beta/\gamma$, which is independent of $f(I)$. The two compartments for the exposed and infectious periods mean that these periods have underlying Erlang-2 distributions with mean exposed and infectious periods $1/\sigma$ and $1/\gamma$ respectively, which have been shown to more accurately represent the shape of observed distributions [24]. We found similar results using standard SEIR-type models; these results are presented in the electronic supplementary material.
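To make the structure of (2.1)–(2.6) concrete, the following is a minimal pure-Python sketch of the system, integrated with a hand-written classical Runge–Kutta step. Several choices here are assumptions for illustration rather than details taken from the paper: the integrator and step size, the initial condition (a fraction `i0` placed in $I_1$), the helper names, and the illustrative media parameter 0.5. The total infectious proportion is taken as $I = I_1 + I_2$, and $\beta$ is obtained from $R_0 = \beta/\gamma$.

```python
def seeiir_m_rhs(state, beta, sigma, gamma, media_fn):
    """Right-hand side of equations (2.1)-(2.6), with I = I1 + I2."""
    s, e1, e2, i1, i2, _r = state
    i_total = i1 + i2
    new_infections = beta * media_fn(i_total) * s * i_total
    return (
        -new_infections,
        new_infections - 2 * sigma * e1,
        2 * sigma * e1 - 2 * sigma * e2,
        2 * sigma * e2 - 2 * gamma * i1,
        2 * gamma * i1 - 2 * gamma * i2,
        2 * gamma * i2,
    )


def rk4_step(state, dt, *params):
    """One classical fourth-order Runge-Kutta step (a stand-in integrator)."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = seeiir_m_rhs(state, *params)
    k2 = seeiir_m_rhs(shifted(k1, dt / 2), *params)
    k3 = seeiir_m_rhs(shifted(k2, dt / 2), *params)
    k4 = seeiir_m_rhs(shifted(k3, dt), *params)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(r0, latent_days, infectious_days, media_fn, i0=1e-4, days=175, dt=0.1):
    """Integrate the model; returns a list of (S, E1, E2, I1, I2, R) tuples."""
    sigma, gamma = 1 / latent_days, 1 / infectious_days
    beta = r0 * gamma  # from R_0 = beta / gamma
    state = (1 - i0, 0.0, 0.0, i0, 0.0, 0.0)  # assumed initial condition
    trajectory = [state]
    for _ in range(round(days / dt)):
        state = rk4_step(state, dt, beta, sigma, gamma, media_fn)
        trajectory.append(state)
    return trajectory


# Standard SEEIIR model (f = 1) versus a linear media function with an
# illustrative parameter of 0.5.
no_media = simulate(1.5, 2.0, 2.0, lambda i: 1.0)
with_media = simulate(1.5, 2.0, 2.0, lambda i: 1.0 - 0.5 * i)
```

Because the six right-hand sides sum to zero, the compartments continue to sum to one along the computed trajectory, which gives a convenient sanity check on any implementation.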
Note that we have not included vaccination in our model, for two reasons: firstly, for comparison with media models from previous studies (see below) which use SEIR-type models without vaccination; and secondly, because vaccination coverage in adults has remained approximately constant since 2010 [25] and the Twitter data we will study primarily relates to media reporting around the peak of the influenza season rather than the earlier peak of the vaccination season. Our model, therefore, can essentially be considered as a model for influenza dynamics in the unvaccinated portion of the population.

Previous studies have postulated a number of different forms for $f(I)$; see [22] for a recent review. In particular, Cui et al. [26] set
$$f_1(I) = \exp(-p_1 I) \tag{2.7}$$
within an SEI model, Xiao & Ruan [27] used
$$f_2(I) = \frac{1}{1 + p_2 I^2} \tag{2.8}$$
within an SIR model to account for the psychological effects of a large population infected with SARS, and many authors (e.g. [28,29]) set
$$f_3(I) = \frac{1}{1 + p_3 I} \tag{2.9}$$
to account for various effects including media coverage.

To compare the model outputs with real data, we use influenza surveillance data provided by the US Centers for Disease Control and Prevention (CDC) [25]. Specifically, we fit models to the nation-wide percentage of new laboratory-confirmed influenza cases per week. We find best fits for the free model parameters to the surveillance data by minimizing least-square error between model solutions and surveillance data using a limited memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS-B) method [30], implemented in Python. To ensure the stability of the numerical optimization routine, we constrain $R_0$ to be between 1 and 2, the mean infectious period $1/\gamma$ to be between 1 and 5 days, and the mean exposed time $1/\sigma$ to be between 1 and 3 days. To perform model selection, we use the Akaike Information Criterion (AIC) with finite sample size correction.

3. Results

We use the act of sharing a message pertaining to influenza as a proxy for an individual engaging with media about an influenza outbreak. While this act of sharing does not necessarily imply that the individual will change their behaviour, it does suggest that the individual is at least somewhat concerned by the media surrounding the influenza outbreak. Figure 1 shows the relationship between the proportion of US-based tweets which were retweets concerning influenza (that is, the number of retweets containing one or more of the strings ‘influenza’, ‘flu’, ‘#influenza’ or ‘#flu’ divided by the total number of tweets) and the number of influenza-like-illness (ILI) cases per week for the 2009/2010 to 2014/2015 influenza seasons, expressed as a percentage of the total number of visits to sentinel providers. The data on weekly counts of ILI activity and retweeting rates used can be found in the electronic supplementary material. We chose to fit to ILI activity rather than laboratory-confirmed influenza incidence because we expect individuals to tend to share flu-related information on social media upon feeling ill, rather than strictly once they are confirmed to have influenza.

The 2009/2010 pandemic (plotted in the lower left subplot) stands out as having the largest number of both ILI cases and retweet activity. We observe strong Pearson’s correlations between retweets and influenza activity for 3 out of the 6 years plotted (2009/2010, 2012/2013 and 2013/2014; p < 0.01). Importantly, while the relationship between media engagement and flu activity is small, it is roughly linear for most flu seasons plotted. Using AIC to test linear and quadratic models for the data, we found that the linear model was selected in all seasons apart from 2009/2010. We show linear and quadratic fits for this season as well as 2014/2015 in the subplots below the main figure.
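The comparison of linear and quadratic fits by AIC with finite sample size correction can be illustrated as follows. The paper does not state which AIC convention it uses, so this sketch assumes the common least-squares form $\mathrm{AIC} = n \ln(\mathrm{RSS}/n) + 2k$, where $k$ counts the fitted coefficients plus the residual variance, together with the usual correction $2k(k+1)/(n-k-1)$. The function names are hypothetical.

```python
import math


def aicc_least_squares(rss, n, n_coeffs):
    """Small-sample corrected AIC for a least-squares fit.

    rss      : residual sum of squares of the fit
    n        : number of observations
    n_coeffs : number of fitted coefficients (2 for linear, 3 for quadratic)
    Assumes k = n_coeffs + 1, counting the residual variance as a parameter.
    """
    k = n_coeffs + 1
    if n - k - 1 <= 0:
        raise ValueError("too few observations for the AICc correction")
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


def akaike_weights(scores):
    """Normalized relative likelihoods exp(-delta/2) for a list of AIC(c) scores."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical example: a quadratic fit that reduces the RSS only marginally
# relative to a linear fit over 30 weekly observations.
linear = aicc_least_squares(rss=1.00, n=30, n_coeffs=2)
quadratic = aicc_least_squares(rss=0.98, n=30, n_coeffs=3)
weights = akaike_weights([linear, quadratic])
```

Akaike weights sum to one across the candidate set, so a pair of weights close to one half each signals that the data do not strongly favour either model.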
In 2014/2015, the linear model was slightly preferred with a relative Akaike weight of 0.58 to 0.42 for the quadratic model, while in 2009/2010 the quadratic model was slightly preferred with a relative Akaike weight of 0.55 to 0.45 for the linear model. Note that as demonstrated by the model fits in the two subfigures, the Akaike weights indicate that there is substantial support in the data for both the linear and quadratic models. Indeed, we found that the relative likelihood of the quadratic model increased with the total number of ILI cases per season (see electronic supplementary material), suggesting that nonlinear media effects may become increasingly relevant during more severe outbreaks. We also present residual plots for the linear and quadratic models for all years in the electronic supplementary material, showing no obvious non-random patterns for the model fits, along with further details of the AIC model selection and a table of relative Akaike weights for all years.

Note also that we observed similar-looking relationships between media engagement and influenza activity when using the number of comments on flu-related articles in the New York Times between 2001 and 2013 as our metric for media engagement. However, due to the smaller amount of data we could only find a statistically significant correlation between the two during the 2009 pandemic.

Based upon these observations and for simplicity in comparing models, we propose the following simple linear media function to describe the reduction in transmission due to media effects:
$$f_m(I) = 1 - p_m I, \tag{3.1}$$
where $p_m$ is a parameter (to be fitted) describing the reduction in actual transmission due to concern from media coverage. Yorke & London applied a similar function in a different context, to model exposure rates for seasonal measles outbreaks [31]. Note that in order to assure that $0 \le f_m(I) \le 1$ it will be necessary to constrain $p_m$ such that $0 \le p_m \le 1$, as $I \le 1$ always.
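For side-by-side comparison, the linear media function (3.1) and the three earlier forms can be written as plain Python functions. This is a sketch only: the function names are hypothetical, and the form used for `f_2` (a squared term in the denominator) is the reading of (2.8) assumed here, consistent with the nonmonotone incidence rate of Xiao & Ruan. The parameter values in the example are the 2013/2014 best-fit values quoted in table 1.

```python
import math


def f_none(i):
    """No media effect: f(I) = 1, recovering the standard SEEIIR model."""
    return 1.0


def f_m(i, p_m):
    """Linear media function (3.1); p_m is constrained to [0, 1]."""
    if not 0.0 <= p_m <= 1.0:
        raise ValueError("p_m must lie in [0, 1] so that 0 <= f_m(I) <= 1")
    return 1.0 - p_m * i


def f_1(i, p_1):
    """Exponential media function (2.7), p_1 >= 0."""
    return math.exp(-p_1 * i)


def f_2(i, p_2):
    """Media function (2.8), assumed here to be 1 / (1 + p_2 I^2), p_2 >= 0."""
    return 1.0 / (1.0 + p_2 * i * i)


def f_3(i, p_3):
    """Media function (2.9): 1 / (1 + p_3 I), p_3 >= 0."""
    return 1.0 / (1.0 + p_3 * i)


# Evaluate each function on a grid of infectious proportions I in [0, 1],
# using the best-fit parameters reported for the 2013/2014 season.
grid = [k / 100 for k in range(101)]
curves = {
    "f_m": [f_m(i, 0.3316) for i in grid],
    "f_1": [f_1(i, 0.1543) for i in grid],
    "f_2": [f_2(i, 0.7381) for i in grid],
    "f_3": [f_3(i, 0.8140) for i in grid],
}
```

Each function equals 1 at $I = 0$ and decreases monotonically on $[0, 1]$, which is why $R_0 = \beta/\gamma$ is unaffected by the choice of media function.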
Whereas (3.1) requires $0 \le p_m \le 1$, the parameters of the media functions (2.7)–(2.9) can take on any value $p_1, p_2, p_3 \ge 0$. We remark that while an obvious extension for larger outbreaks would be to use a quadratic media function $f_m(I) = 1 - p_{m1} I - p_{m2} I^2$, for ease of comparison with existing media functions, we will only consider the one-parameter model (3.1).

We show an example of the effect of the media function $f_m$ upon the dynamics in figure 2, where we have set $p_m = 0.05$, $R_0 = 1.5$, $\gamma = 1/2\ \mathrm{d}^{-1}$, $\sigma = 1/2\ \mathrm{d}^{-1}$ and have plotted $E = E_1 + E_2$ and $I = I_1 + I_2$. The media function reduces the total number of infected persons (i.e. the final size of the epidemic) and the size of the peak, while not notably changing the timing of the peak. The slower rate of depletion of susceptibles means that the infection dies out slightly more slowly in the model with media effect.

To investigate how well the various transmission models, both with and without media effects, describe real influenza outbreaks, we fit (2.1)–(2.6) with $f(I) \equiv 1$ as well as (2.7)–(3.1) to weekly laboratory-confirmed influenza incidence data for the 1998–2013 flu seasons using least squares. Note that unlike social media engagement, which can be reasonably expected to relate to ILI, it is appropriate to fit models of the underlying disease dynamics to confirmed influenza incidence data only. Using the L-BFGS-B method, we find parameter values $R_0$, $\sigma$, $\gamma$ and media parameter $p_m$ which best fit the data. The best-fitting parameters for each model for the 2013/2014 flu season are shown in table 1, and for all other seasons are shown in the electronic supplementary material. We fit observations from four weeks before the peak to 12 weeks after the peak. Also shown in table 1 are the average conditional probabilities for each model, as obtained from the normalized Akaike weights for each model across all flu seasons between 1998/1999 and 2014/2015 in which a non-zero media function was found.

Figure 1. Media engagement from Twitter data. Correlation between proportion of public retweets regarding ‘flu’ and number of influenza-like-illness (ILI) cases, 2009/10–2014/15. ILI data are expressed as a percentage of the total number of visits to sentinel surveillance providers. Linear trend lines are shown for the years showing significant (p < 0.01) correlation. The subfigures show both quadratic and linear fits to the data for the 2009/2010 and 2014/2015 seasons.
[Axes: ILI (%) against proportion of ‘flu’ RTs. Main panel legend: 2010/2011, 2011/2012, 2012/2013, 2013/2014. Lower panels: 2009/2010 and 2014/2015, each with linear and quadratic fits.]

In figure 3, we show example fits to observations of the percentage of new laboratory-confirmed influenza cases per week (blue) for the model with no media effect (red) and media functions given by $f_m$ (green), $f_1$ (cyan), $f_2$ (magenta) and $f_3$ (yellow) for the 2013/2014 influenza season. As with the ILI data, the laboratory-confirmed case data are expressed as a percentage of the total number of visits to sentinel surveillance providers. The inset plot shows the corresponding media functions with best-fitting parameters. While no model is able to correctly estimate the size of the peak of the infection, the model with linear media function $f_m$ is the only one which correctly identifies the week in which the infection peaks. The media functions $f_m$ and $f_1$ are also the only models which describe the decay of the infection post-peak well.

While the media function $f_m(I)$ was derived based upon Twitter data, our intention in focusing on news-sharing behaviours is to model the effects of mass media more generally.
Indeed, we might expect that population-level engagement with other forms of mass media show a similar monotonically decreasing relationship between media coverage and transmission. To that end, we now apply the proportion of ‘flu’ RTs proportion of ‘flu’ RTs rsos.royalsocietypublishing.org R. Soc. open sci. 3: 160481 ................................................ 1.0 0.040 without media 0.035 0.9 with media 0.030 0.8 0.025 S 0.7 E 0.020 0.015 0.6 0.010 0.5 0.005 0.4 02 5 10 15 205 0 5 10 15 20 25 0.040 0.6 0.035 0.5 0.030 0.4 0.025 0.3 I 0.020 R 0.015 0.2 0.010 0.1 0.005 0 5 10 15 20 25 0 5 10 15 20 25 time (weeks) time (weeks) Figure 2. Sample time series showing the effect of the media function f (I) = 1 − p I. The media function reduces the final epidemic m m size and slows the decay rate of the outbreak. 0.8 1.0 0.9 0.7 0.8 0.6 0.7 0 0.2 0.4 0.6 0.8 incidence (%) 0.5 0.4 0.3 0.2 0.1 0 2468 10 12 14 16 week Figure3. Bestmodelfitstodatafrom2013/2014fluseason.Weeklylaboratory-confirmedinfluenzaincidencedata(bluedots),andmodel fits for SEEIIR model without media function (red), with media function (green) and variations f (cyan), f (magenta) and f (yellow), 1 2 3 for the 2013/2014 influenza season (USA). Incidence data are normalized by the total number of visits to sentinel providers. Table 1. Parameters of best fit for SEEIIR and SEEIIR-M models for 2013/2014 influenza season. f(I) ≡ 1 f f f f m 1 2 3 R 1.1574 1.5101 1.8574 1.4949 1.9281 ........................................ ............................................ ........................................... .............................................. ............................................ 1/σ (days) 1 1.6881 2.2162 1.9873 1.3794 ........................................ ............................................ ........................................... .............................................. ............................................ 
1/γ (days)    1.1979      1          1.2719      1.0979      1.1001
p             -           0.3316     0.1543      0.7381      0.8140
p_AIC         O(10^−9)    >0.9999    O(10^−8)    O(10^−7)    O(10^−5)

rsos.royalsocietypublishing.org R. Soc. open sci. 3: 160481

[Figure 4: boxplot panels (a), (b) and (c); categories in each panel: no media, f_m, f_1, f_2, f_3.]

Figure 4. Boxplots of RMS error (a), peak timing error (b) and final epidemic size error (c), for the standard model and models with media effects for the 1998/1999–2014/2015 seasons. The 2009/2010 pandemic influenza season has been excluded. Blue dots show results from individual years (where we have added random jitter for visibility), crosses show outliers.

Table 2. Average probability of selecting each model over the 1998/1999–2014/2015 seasons. We have fitted over a 16-week period for each season. The 2009/2010 pandemic influenza season has been excluded.

model                      p_AIC     95% CI
f(I) ≡ 1                   0.0500    [0, 0.1398]
f_m(I) = 1 − p_m I         0.8347    [0.6674, 1]
f_1(I) = exp(−γ p_1 I)     0.0267    [0, 0.0535]
f_2(I)                     0.0060    [0.0001, 0.0120]
f_3(I)                     0.0826    [0, 0.1871]

proposed media function to all influenza seasons we have incidence data for, and find similar results for most seasons between 1998/1999 and 2014/2015. Table 2 shows the average conditional probability of selecting each model, where the average is taken over all years in which a media function is required at all. Also shown are the 95% confidence intervals for each average conditional probability. No media functions of any kind were required to describe the 2003/2004 flu season, one of the alternative media functions gave the best fit to observations in 2006/2007 only, and one gave the best fit in 2009/2010 only.

We next examine how well models with and without a media function estimate the complete epidemic curve, as well as the peak timing and severity. Figure 4 shows boxplots of (i) RMS error, (ii) peak timing error and (iii) final epidemic size error, for the model with no media effect as well as the media functions defined in (2.7)–(3.1), over the 1998/1999–2014/2015 seasons.
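The three error measures summarised in Figure 4 can be illustrated with a minimal sketch. The exact definitions used for the figure are not spelled out in this section, so the sign conventions (modelled minus observed) and the use of cumulative incidence over the fitting window as the final size are assumptions, and the incidence values are made up.

```python
import math

def rms_error(observed, modelled):
    """Root-mean-square difference between two equal-length incidence series."""
    if len(observed) != len(modelled):
        raise ValueError("series must have the same length")
    squared = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    return math.sqrt(squared / len(observed))

def peak_timing_error(observed, modelled):
    """Modelled peak week minus observed peak week (negative: model peaks early)."""
    return modelled.index(max(modelled)) - observed.index(max(observed))

def final_size_error(observed, modelled):
    """Modelled minus observed cumulative incidence over the fitting window (assumed definition)."""
    return sum(modelled) - sum(observed)

# Hypothetical weekly incidence (% of population), for illustration only.
observed = [0.5, 1.0, 2.0, 3.0, 2.0, 1.0]
modelled = [0.5, 1.5, 3.0, 2.0, 1.5, 0.5]

print("RMS error:", rms_error(observed, modelled))
print("peak timing error (weeks):", peak_timing_error(observed, modelled))
print("final size error:", final_size_error(observed, modelled))
```

Computing one triple of errors per season and per model would give the points shown in each boxplot panel.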
The proposed media function f_m significantly outperforms all other models, with or without media effects, at fitting the epidemic curve, with the distribution of RMS errors significantly less spread and centred closer to 0 than for all other models. All models with a media effect are significantly better than the standard model at matching the observed peak timing of an outbreak (Mood's median test, p = 0.05), although there is no significant difference between the four media models. Similarly, there is no significant difference across models in explaining the observed final epidemic size; in fact, the median error for the standard model without media effect is slightly lower than that for the models with media functions (however, this difference is not significant). We remark that much of the improvement made by the media function f_m comes from better describing the post-peak period.

[Figure 5: x-axis, weeks before peak; curves, without media and with media.]

Figure 5. Quality of model fits as a function of lead time. Probability that the no-media model (green) and media model with f_m(I) = 1 − p_m I (blue) are the better model as a function of number of weeks before peak. In each case, we fit models to data over a 16-week period. The 2009/2010 pandemic influenza season has been excluded.

The no-media (i.e. f(I) ≡ 1) model becomes preferable as more of the data leading up to the peak of each season is used to fit the models. In figure 5, we show the average conditional probability of selecting each model as a function of the number of weeks of data used before the peak. The no-media and f_m models are always preferred over the other media function models (i.e. using f_1, f_2 and f_3), with the f_m model being preferred up until around 10–12 weeks before the peak.
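One common way to turn AIC scores into a probability of selecting each model is through Akaike weights. The sketch below assumes that this is what the model-selection probabilities reported here represent, which this section does not state explicitly. The AIC values and model labels are hypothetical.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC scores into relative model probabilities (Akaike weights)."""
    best = min(aic_values.values())
    # Relative likelihood of each model compared with the best (lowest-AIC) one.
    relative = {name: math.exp(-0.5 * (aic - best)) for name, aic in aic_values.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}

def mean_weights(per_season_aic):
    """Average the Akaike weights of each model over a list of seasons."""
    weights = [akaike_weights(season) for season in per_season_aic]
    return {name: sum(w[name] for w in weights) / len(weights) for name in weights[0]}

# Hypothetical AIC scores for two seasons; lower is better.
seasons = [
    {"no_media": 104.0, "f_m": 100.0, "f_1": 110.0},
    {"no_media": 98.0, "f_m": 101.0, "f_1": 105.0},
]

single = akaike_weights(seasons[0])
average = mean_weights(seasons)
print(single)
print(average)
```

Repeating the calculation for fits that stop at different numbers of weeks before the peak would give one curve per model of the kind described for figure 5.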
When fitting data earlier than 12 weeks before the peak, the no-media model is preferred, suggesting that the effect of media coverage becomes more important later in the season. Furthermore, neither model is able to reliably predict the peak of the infection, in terms of either size or timing, based upon data from before the peak only. This suggests that, in order to make accurate predictions and estimate parameters (rather than explain an existing dataset) when only small amounts of data are available, we must use a more advanced methodology such as data assimilation [32].

4. Discussion

Mass media is clearly an important tool for changing people's behaviour during disease outbreaks. A better understanding of the relationship between media coverage of outbreaks and subsequent behavioural change can aid mathematical modelling efforts, as well as the development of public policy around the best use of this resource to inform the public and control the spread of a disease. By using data collected from Twitter, we have proposed a new, simple media function to describe the reduction in disease transmission due to media effects. When incorporated into a deterministic SEEIIR model, this media function describes incidence data better than a model without media effects, and better than previously proposed media functions. We observed a relationship between outbreak size and media awareness, with a quadratic model becoming more likely as the final size of the outbreak increased. This suggests that the relationship between media coverage and infection rates is nonlinear, especially in more severe seasons.

Future extensions to the media function could incorporate extra reductions in disease transmission due to factors such as early media coverage, pre-existing immunity or seasonality. Public awareness campaigns could lead to an increase in early-season social media activity and sharing of news articles, and could be implemented in the current model via a time lag.
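A time lag of this kind could be sketched as follows, by letting the media function respond to the infectious fraction some days earlier. This is an illustration only: the model equations are not given in this section, so the SEEIIR structure used here (two exposed and two infectious stages, each left at twice the overall rate), the forward-Euler integration, the clamping of f_m at zero, the function name `simulate_seeiir` and all parameter values are assumptions.

```python
def simulate_seeiir(beta, sigma, gamma, p_m, lag_days=0.0, days=200, dt=0.1, i0=1e-4):
    """Forward-Euler SEEIIR run; returns (peak infectious fraction, final recovered fraction)."""
    s, e1, e2, i1, i2, r = 1.0 - i0, 0.0, 0.0, i0, 0.0, 0.0
    lag_steps = int(round(lag_days / dt))
    history = []  # infectious fraction at every step, used to look up the lagged value
    for step in range(int(round(days / dt))):
        infectious = i1 + i2
        history.append(infectious)
        # Media responds to prevalence lag_days ago (earliest value until the lag has elapsed).
        lagged = history[max(0, step - lag_steps)]
        # f_m(I) = 1 - p_m I, clamped at zero (assumption) so transmission stays non-negative.
        f_m = max(0.0, 1.0 - p_m * lagged)
        new_infections = beta * f_m * s * infectious
        # Assumed structure: two exposed and two infectious stages.
        ds = -new_infections
        de1 = new_infections - 2 * sigma * e1
        de2 = 2 * sigma * (e1 - e2)
        di1 = 2 * sigma * e2 - 2 * gamma * i1
        di2 = 2 * gamma * (i1 - i2)
        dr = 2 * gamma * i2
        s, e1, e2 = s + dt * ds, e1 + dt * de1, e2 + dt * de2
        i1, i2, r = i1 + dt * di1, i2 + dt * di2, r + dt * dr
    return max(history), r

# Hypothetical parameters: rates per day, infectious period of 1.2 days.
params = dict(beta=1.25, sigma=1.0, gamma=1 / 1.2)
no_media = simulate_seeiir(p_m=0.0, **params)
media = simulate_seeiir(p_m=5.0, **params)
lagged_media = simulate_seeiir(p_m=5.0, lag_days=14, **params)

print("no media     (peak, final size):", no_media)
print("media        (peak, final size):", media)
print("lagged media (peak, final size):", lagged_media)
```

Storing the full history of the infectious fraction keeps the lag lookup simple; a media term driven by observed tweet counts rather than by model prevalence would replace the `lagged` value with an external series.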
Indeed, we observed such an effect for the 2014/2015 season, where changes in retweeting activity preceded ILI rates by a number of weeks. Mass media campaigns have been shown to increase flu-related hospital visits [33] and vaccination rates [34,35]. It is further possible that any potential reduction in transmission in one season due to the effects of mass media could decrease pre-existing immunity for the next season, an effect which could be modelled by conditioning the media function on the total amount of media engagement from the previous season. Identifying any such potential process is of course confounded by the presence of multiple influenza strains circulating in any particular season with differing levels of pre-existing immunity; modelling such a hierarchy of time-lagged effects requires a more sophisticated strategy and is left for future work.

The interplay between mass media, social network influence, human behavioural change and disease transmission is complex, and this work merely scratches the surface of the processes which could be modelled using this framework. Further extensions could build upon efforts to incorporate interactions between social and contact network structures into the model [36] by inferring the mass media effect directly from social network data. There is also an emerging body of work around using open data to infer human behaviours such as mobility patterns [37] and voluntary avoidance [38]. The same data used here to track media engagement could potentially be exploited to quantify such effects, as well as to develop a proxy for real-time surveillance on practices such as vaccination, which we aim to incorporate into future refinements of this model.

A critical assumption made in this work is that the population is homogeneously mixing and not age-stratified.
This is of course far from being the case for Twitter users. Indeed, it is well known that the demographics of Twitter use in the United States are biased towards adults aged 18–29, African-Americans and urban residents [39], and word usage has been shown to correlate with a number of socio-economic and health characteristics [40,41]. Despite these biases, the approximately 10% of American adults who are estimated to use Twitter represent a far larger sample than those of traditional surveys. Furthermore, for simplicity and because the keywords we used were sufficiently specific, we did not filter tweets for relevance. Manual examination of a sample of tweets indicated that an insignificant number of tweets were misclassified as being about influenza; however, constraining the tweet corpus may lead to further improvements in the results.

This work fits into a growing field of research on disease prediction using open data [42], particularly from social network usage. Great advances have already been made on algorithms to predict rare and seasonal diseases, especially in the computer science literature [43]. Our results represent a first attempt at incorporating this emerging data stream into more traditional modelling efforts, and hopefully at better understanding the interactions between media and disease dynamics.

Data accessibility. Data available from the Dryad Digital Repository at http://dx.doi.org/10.5061/dryad.593cc [44].

Authors' contributions. L.M. and J.V.R. conceived of the study; L.M. performed data analysis and simulations; L.M. and J.V.R. wrote the manuscript.

Competing interests. We have no competing interests.

Funding. The authors received funding from Data To Decisions Cooperative Research Centre (D2D CRC). J.V.R.
received funding from the Australian Research Council through the Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), the Future Fellowship scheme (FT130100254), and the National Health and Medical Research Council (NHMRC) Centre of Research Excellence for Policy Relevant Infectious Disease Simulation and Mathematical Modelling (PRISM).

Acknowledgements. The authors wish to thank P.S. Dodds and C.M. Danforth from the Computational Story Lab at the University of Vermont for use of the Twitter Gardenhose feed for this study.

References

1. Keeling MJ, Rohani P. 2007 Modeling infectious diseases in humans and animals. Princeton, NJ: Princeton University Press.
2. Funk S, Salathé M, Jansen VAA. 2010 Modelling the influence of human behaviour on the spread of infectious diseases: a review. J. R. Soc. Interface 7, 1247–1256. (doi:10.1098/rsif.2010.0142)
3. Epstein JM, Parker J, Cummings D, Hammond RA. 2008 Coupled contagion dynamics of fear and disease: mathematical and computational explorations. PLoS ONE 3, e3955. (doi:10.1371/journal.pone.0003955)
4. Bell DM et al. 2004 Public health interventions and SARS spread, 2003. Emerg. Infect. Dis. 10, 1900–1906. (doi:10.3201/eid1011.040729)
5. Geoffard PY, Philipson T. 1996 Rational epidemics and their public control. Int. Econ. Rev. 37, 603–624. (doi:10.2307/2527443)
6. Philipson T. 1996 Private vaccination and public health: an empirical examination for U.S. measles. J. Hum. Resour. 31, 611–630. (doi:10.2307/146268)
7. Kremer M. 1996 Integrating behavioral choice into epidemiological models of AIDS. Q. J. Econ. 111, 549–573. (doi:10.2307/2946687)
8. Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L. 2009 Detecting influenza epidemics using search engine query data. Nature 457, 1012–1014. (doi:10.1038/nature07634)
9. Culotta A. 2010 Towards detecting influenza epidemics by analyzing Twitter messages. In Proc. of the first workshop on social media analytics, Washington, DC, July, pp. 115–122.
10. Lampos V, Cristianini N. 2010 Tracking the flu pandemic by monitoring the social web. In 2nd Int. Workshop on Cognitive Information Processing, Elba Island, Italy, June, pp. 411–416.
11. Signorini A, Segre AM, Polgreen PM. 2011 The use of Twitter to track levels of disease activity and public concern in the U.S. during the influenza A H1N1 pandemic. PLoS ONE 6, e19467. (doi:10.1371/journal.pone.0019467)
12. Lamb A, Paul MJ, Dredze M. 2013 Separating fact from fear: tracking flu infections on Twitter. In NAACL-HLT 2013, Atlanta, GA, June, pp. 789–795.
13. Collinson S, Khan K, Heffernan JM. 2015 The effects of media reports on disease spread and important public health measurements. PLoS ONE 10, e0141423. (doi:10.1371/journal.pone.0141423)
14. Zhang Q, Gioannini C, Paolotti D, Perra N, Perrotta D, Quaggiotto M, Tizzoni M, Vespignani A. 2015 Social data mining and seasonal influenza forecasts: the FluOutlook platform. In Machine learning and knowledge discovery in databases (eds A Bifet, M May, B Zadrozny, R Gavalda, D Pedreschi, F Bonchi, J Cardoso, M Spiliopoulou). Lecture Notes in Computer Science, vol. 9286, pp. 237–240. Berlin, Germany: Springer International Publishing.
15. D'Onofrio A, Manfredi P, Salinelli E. 2007 Vaccinating behaviour, information, and the dynamics of SIR vaccine preventable diseases. Theor. Popul. Biol. 71, 301–317. (doi:10.1016/j.tpb.2007.01.001)
16. Greenhalgh D, Rana S, Samanta S, Sardar T, Bhattacharya S, Chattopadhyay J. 2015 Awareness programs control infectious disease – multiple delay induced mathematical model. Appl. Math. Comput. 251, 539–563. (doi:10.1016/j.amc.2014.11.091)
17. Kiss IZ, Cassell J, Recker M, Simon PL. 2010 The impact of information transmission on epidemic outbreaks. Math. Biosci. 225, 1–10. (doi:10.1016/j.mbs.2009.11.009)
18. Xiao Y, Tang S, Wu J. 2015 Media impact switching surface during an infectious disease outbreak. Sci. Rep. 5, 7838. (doi:10.1038/srep07838)
19. Majumder M, Kluberg S, Santillana M, Mekaru S, Brownstein J. 2015 2014 Ebola outbreak: media events track changes in observed reproductive number. PLoS Curr. Outbreaks. (doi:10.1371/currents.outbreaks.e6659013c1d7f11bdab6a20705d1e865)
20. Achrekar H, Gandhe A, Lazarus R, Yu SH, Liu B. 2011 Predicting flu trends using Twitter data. In 2011 IEEE Conf. on Computer Communications Workshops (INFOCOM WKSHPS), Shanghai, China, April, pp. 702–707.
21. Broniatowski DA, Paul MJ, Dredze M. 2013 National and local influenza surveillance through Twitter: an analysis of the 2012-2013 influenza epidemic. PLoS ONE 8, e83672. (doi:10.1371/journal.pone.0083672)
22. Collinson S, Heffernan JM. 2014 Modelling the effects of media during an influenza epidemic. BMC Public Health 14, 376. (doi:10.1186/1471-2458-14-376)
23. Lazer D, Kennedy R, King G, Vespignani A. 2014 The parable of Google Flu: traps in big data analysis. Science 343, 1203–1205. (doi:10.1126/science.1248506)
24. Wearing HJ, Rohani P, Keeling MJ. 2005 Appropriate models for the management of infectious diseases. PLoS Med. 2, 0621–0627. (doi:10.1371/journal.pmed.0020174)
25. CDC Influenza (flu) reports (online). See http://www.cdc.gov/flu/ (accessed 25 February 2016).
26. Cui J, Sun Y, Zhu H. 2007 The impact of media on the control of infectious diseases. J. Dyn. Differ. Equ. 20, 31–53. (doi:10.1007/s10884-007-9075-0)
27. Xiao D, Ruan S. 2007 Global analysis of an epidemic model with nonmonotone incidence rate. Math. Biosci. 208, 419–429. (doi:10.1016/j.mbs.2006.09.025)
28. Cui J, Tao X, Zhu H. 2008 An SIS infection model incorporating media coverage. Rocky Mt. J. Math. 38, 1323–1334. (doi:10.1216/RMJ-2008-38-5-1323)
29. Tchuenche JM, Dube N, Bhunu CP, Smith RJ, Bauch CT. 2011 The impact of media coverage on the transmission dynamics of human influenza. BMC Public Health 11 (Suppl 1), S5. (doi:10.1186/1471-2458-11-S1-S5)
30. Zhu C, Byrd RH, Lu P, Nocedal J. 1997 Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans. Math. Softw. 23, 550–560. (doi:10.1145/279232.279236)
31. Yorke JA, London WP. 1973 Recurrent outbreaks of measles, chicken-pox and mumps. I. The role of seasonality. Am. J. Epidemiol. 98, 453–468.
32. Shaman J, Karspeck A. 2012 Forecasting seasonal outbreaks of influenza. Proc. Natl Acad. Sci. USA 109, 20425–20430. (doi:10.1073/pnas.1208772109)
33. Codish S, Novack L, Dreiher J, Barski L, Jotkowitz A, Zeller L, Novack V. 2014 Impact of mass media on public behavior and physicians: an ecological study of the H1N1 influenza pandemic. Infect. Control Hosp. Epidemiol. 35, 709–716. (doi:10.1086/676426)
34. Ma K, Schaffner W, Colmenares C, Howser J, Jones J, Poehling K. 2006 Influenza vaccinations of young children increased with media coverage in 2003. Pediatrics 117, e157–e163. (doi:10.1542/peds.2005-1079)
35. Yoo BK, Holland ML, Bhattacharya J, Phelps CE, Szilagyi PG. 2010 Effects of mass media coverage on timing and annual receipt of influenza vaccination among medicare elderly. Health Serv. Res. 45, 1287–1309. (doi:10.1111/j.1475-6773.2010.01127.x)
36. Funk S, Jansen VAA. 2012 The talk of the town: modelling the spread of information and changes in behaviour. In Modeling the Interplay Between Human Behavior and the Spread of Infectious Diseases (eds P Manfredi, A D'Onofrio), pp. 93–102. New York, NY: Springer.
37. Frank MR, Mitchell L, Dodds PS, Danforth CM. 2013 Happiness and the patterns of life: a study of geolocated tweets. Sci. Rep. 3, 2625. (doi:10.1038/srep02625)
38. Bayham J, Kuminoff NV, Gunn Q, Fenichel EP. 2015 Measured voluntary avoidance behaviour during the 2009 A/H1N1 epidemic. Proc. R. Soc. B 282, 20150814. (doi:10.1098/rspb.2015.0814)
39. Duggan M, Brenner J. 2012 The demographics of social media users – 2012. Pew Research Center. See http://pewinternet.org/Reports/2013/Social-media-users.aspx.
40. Mitchell L, Frank MR, Harris KD, Dodds PS, Danforth CM. 2013 The geography of happiness: connecting Twitter sentiment and expression, demographics, and objective characteristics of place. PLoS ONE 8, e64417. (doi:10.1371/journal.pone.0064417)
41. Alajajian SE, Williams JR, Reagan AJ, Alajajian SC, Frank MR, Mitchell L, Lahne J, Danforth CM, Dodds PS. 2015 The Lexicocalorimeter: gauging public health through caloric input and output on social media. Preprint. See http://arxiv.org/abs/150705098.
42. Althouse B et al. 2015 Enhancing disease surveillance with novel data streams: challenges and opportunities. EPJ Data Sci. 4, 1–8. (doi:10.1140/epjds/s13688-015-0054-0)
43. Chen L, Hossain KSMT, Butler P, Ramakrishnan N, Prakash BA. 2014 Flu gone viral: syndromic surveillance of flu on Twitter using temporal topic models. In Int. Conf. on data mining, pp. 755–760.
44. Mitchell L, Ross JV. 2016 Data from: A data-driven model for influenza transmission incorporating media effects. Dryad Digital Repository. (doi:10.5061/dryad.593cc)
Royal Society Open Science – Pubmed Central
Published: Oct 26, 2016