3 Model description

Here we describe the methods of MOSAIC version 1.0. This model version provides a starting point for understanding cholera transmission in Sub-Saharan Africa, incorporating important drivers of disease dynamics such as human mobility, environmental conditions, and vaccination schedules. As MOSAIC continues to evolve, future iterations will refine model components based on available data and improved model mechanisms, which we hope will increase its applicability to real-world scenarios.

The model operates on daily time steps and will be fitted to historical incidence data. Current development, however, is based on data from January 2023 to August 2024 and includes 40 countries in Sub-Saharan Africa (SSA); see Figure 3.1 and the Table of MOSAIC framework countries.

Figure 3.1: A map of Sub-Saharan Africa with countries that have experienced a cholera outbreak in the past 5 and 10 years highlighted in green. The 40 countries included in the MOSAIC modeling framework are indicated in blue.

3.1 Transmission dynamics

The model has a metapopulation structure with familiar compartments for Susceptible, Exposed, Infected, and Recovered individuals with SEIRS dynamics. The model also contains compartments for one- and two-dose vaccination ($V_1$ and $V_2$) and for water- and environment-based transmission ($W$). We refer to the full structure as SVEIWRS.

Figure 3.2: This diagram of the SVEIWRS (Susceptible-Vaccinated-Exposed-Infected-Water/environmental-Recovered-Susceptible) model shows model compartments as circles with rate parameters displayed. The primary data sources the model is fit to are shown as square nodes (Vaccination data, and reported cases and deaths).

The SVEIWRS metapopulation model, shown in Figure 3.2, is governed by the following difference equations:

\[\begin{equation} \begin{aligned} \mathbf{\text{Susceptible population:}}\\[1mm] S_{j,t+1} = \ & S_{jt} + b_{jt}\,N_{jt} + \varepsilon\,R_{jt} - \frac{\nu_{1,jt}\,S_{jt}}{\left(S_{jt}+E_{jt}\right)} - \left( \Lambda^{S}_{j,t+1} + \Psi^{S}_{j,t+1} \right) - d_{jt}\,S_{jt}\\[3mm] \mathbf{\text{One-dose vaccination:}}\\[1mm] V^{\text{imm}}_{1,j,t+1} = \ & V^{\text{imm}}_{1,jt} + \frac{\phi_1\,\nu_{1,jt}\,S_{jt}}{\left(S_{jt}+E_{jt}\right)} - \omega_1\,V^{\text{imm}}_{1,jt} - \frac{\nu_{2,jt}\,V^{\text{imm}}_{1,jt}}{\left(V^{\text{imm}}_{1,jt}+V^{\text{sus}}_{1,jt}\right)} - d_{jt}\,V^{\text{imm}}_{1,jt}\\[1mm] V^{\text{sus}}_{1,j,t+1} = \ & V^{\text{sus}}_{1,jt} + \frac{\left(1-\phi_1\right)\,\nu_{1,jt}\,S_{jt}}{\left(S_{jt}+E_{jt}\right)} + \omega_1\,V^{\text{imm}}_{1,jt} - \left( \Lambda^{V_1}_{j,t+1} + \Psi^{V_1}_{j,t+1} \right) - \frac{\nu_{2,jt}\,V^{\text{sus}}_{1,jt}}{\left(V^{\text{imm}}_{1,jt}+V^{\text{sus}}_{1,jt}\right)} - d_{jt}\,V^{\text{sus}}_{1,jt}\\[1mm] V^{\text{inf}}_{1,j,t+1} = \ & V^{\text{inf}}_{1,jt} + \left( \Lambda^{V_1}_{j,t+1} + \Psi^{V_1}_{j,t+1} \right) - d_{jt}\,V^{\text{inf}}_{1,jt} \quad \mathbf{\text{(tracking only)}}\\[6mm] \mathbf{\text{Two-dose vaccination:}}\\[3mm] V^{\text{imm}}_{2,j,t+1} = \ & V^{\text{imm}}_{2,jt} + \phi_2\,\nu_{2,jt} - \omega_2\,V^{\text{imm}}_{2,jt} - d_{jt}\,V^{\text{imm}}_{2,jt}\\[3mm] V^{\text{sus}}_{2,j,t+1} = \ & V^{\text{sus}}_{2,jt} + \left(1-\phi_2\right)\,\nu_{2,jt} + \omega_2\,V^{\text{imm}}_{2,jt} - \left( \Lambda^{V_2}_{j,t+1} + \Psi^{V_2}_{j,t+1} \right) - d_{jt}\,V^{\text{sus}}_{2,jt}\\[3mm] V^{\text{inf}}_{2,j,t+1} = \ & V^{\text{inf}}_{2,jt} + \left( \Lambda^{V_2}_{j,t+1} + \Psi^{V_2}_{j,t+1} \right) - d_{jt}\,V^{\text{inf}}_{2,jt} \quad \mathbf{\text{(tracking only)}}\\[6mm] \mathbf{\text{Infection dynamics:}}\\[3mm] E_{j,t+1} = \ & E_{jt} + \left( \Lambda_{j,t+1} + \Psi_{j,t+1}\right) - \iota\,E_{jt} - d_{jt}\,E_{jt}\\[3mm] I_{1,j,t+1} = \ & I_{1,jt} + \sigma\,\iota\,E_{jt} - 
\gamma_1\,I_{1,jt} - \mu_j\,I_{1,jt} - d_{jt}\,I_{1,jt}\\[3mm] I_{2,j,t+1} = \ & I_{2,jt} + \left(1-\sigma\right)\,\iota\,E_{jt} - \gamma_2\,I_{2,jt} - d_{jt}\,I_{2,jt}\\[3mm] R_{j,t+1} = \ & R_{jt} + \left( \gamma_1\,I_{1,jt} + \gamma_2\,I_{2,jt} \right) - \varepsilon\,R_{jt} - d_{jt}\,R_{jt}\\[5mm] \mathbf{\text{Environment:}}\\[3mm] W_{j,t+1} = \ & W_{jt} + \left(1-\theta_j\right)\left( \zeta_1\,I_{1,jt} + \zeta_2\,I_{2,jt} \right) - \delta_{jt}\,W_{jt}\\[3mm] \end{aligned} \tag{3.1} \end{equation}\]

For detailed descriptions of all parameters appearing in Equation (3.1), see the Table of model parameters. Transmission dynamics in the model are governed primarily by two distinct force-of-infection terms: the human-to-human force of infection, $\Lambda_{jt}$, and the environmental force of infection, $\Psi_{jt}$.

The human-to-human transmission component at time $t+1$ in location $j$ is defined separately for susceptible ($S$), one-dose vaccinated ($V_1$), and two-dose vaccinated ($V_2$) individuals as:

\[\begin{equation} \begin{aligned} \Lambda^S_{j,t+1} &= \frac{ \beta_{jt}^{\text{hum}} \, (1-\tau_{j})S_{jt} \, \left[ (1-\tau_{j})(I_{1,jt} + I_{2,jt}) + \sum_{\forall i \neq j} \pi_{ij}\tau_i(I_{1,it} + I_{2,it}) \right]^{\alpha_1}}{N_{jt}^{\alpha_2}},\\[4mm] \Lambda^{V_1}_{j,t+1} &= \frac{ \beta_{jt}^{\text{hum}} \, (1-\tau_{j})V^{\text{sus}}_{1,jt} \, \left[ (1-\tau_{j})(I_{1,jt} + I_{2,jt}) + \sum_{\forall i \neq j} \pi_{ij}\tau_i(I_{1,it} + I_{2,it}) \right]^{\alpha_1}}{N_{jt}^{\alpha_2}},\\[4mm] \Lambda^{V_2}_{j,t+1} &= \frac{ \beta_{jt}^{\text{hum}} \, (1-\tau_{j})V^{\text{sus}}_{2,jt} \, \left[ (1-\tau_{j})(I_{1,jt} + I_{2,jt}) + \sum_{\forall i \neq j} \pi_{ij}\tau_i(I_{1,it} + I_{2,it}) \right]^{\alpha_1}}{N_{jt}^{\alpha_2}}. \end{aligned} \tag{3.2} \end{equation}\]

The total human-to-human force of infection is then the sum of these three components:

\[\begin{equation} \Lambda_{j,t+1} = \Lambda^S_{j,t+1} + \Lambda^{V_1}_{j,t+1} + \Lambda^{V_2}_{j,t+1}. \tag{3.3} \end{equation}\]

In these equations, $\beta_{jt}^{\text{hum}}$ represents the rate of human-to-human transmission. Movement within and among metapopulations is governed by the parameter $\tau_i$, indicating the probability of departing origin location $i$, while $\pi_{ij}$ describes the relative probability of travel from origin $i$ to destination $j$ (see section on spatial dynamics). The terms $\Lambda^{S}_{jt}$, $\Lambda^{V_1}_{jt}$, and $\Lambda^{V_2}_{jt}$ explicitly partition the overall human-to-human force of infection into separate contributions from susceptible, one-dose vaccinated, and two-dose vaccinated individuals, linking directly to the compartmental structure of the model described by the system of difference equations.

The environmental force of infection ($\Psi_{jt}$), capturing environment-to-human transmission at location $j$ and time $t+1$, is also explicitly partitioned into susceptible ($S$), one-dose vaccinated ($V_1$), and two-dose vaccinated ($V_2$) compartments:

\[\begin{equation} \begin{aligned} \Psi^S_{j,t+1} &= \frac{\beta_{jt}^{\text{env}}\, (1-\tau_{j})S_{jt}\,(1-\theta_j)W_{jt}}{\kappa + W_{jt}},\\[4mm] \Psi^{V_1}_{j,t+1} &= \frac{\beta_{jt}^{\text{env}}\, (1-\tau_{j})V^{\text{sus}}_{1,jt}\,(1-\theta_j)W_{jt}}{\kappa + W_{jt}},\\[4mm] \Psi^{V_2}_{j,t+1} &= \frac{\beta_{jt}^{\text{env}}\, (1-\tau_{j})V^{\text{sus}}_{2,jt}\,(1-\theta_j)W_{jt}}{\kappa + W_{jt}}. \end{aligned} \tag{3.4} \end{equation}\]

The total environmental force of infection is then the sum of these three components:

\[\begin{equation} \Psi_{j,t+1} = \Psi^S_{j,t+1} + \Psi^{V_1}_{j,t+1} + \Psi^{V_2}_{j,t+1}. \tag{3.5} \end{equation}\]

Here, $\beta_{jt}^{\text{env}}$ denotes the rate of environment-to-human transmission, and $\theta_j$ is the proportion of the population at location $j$ with at least basic access to Water, Sanitation, and Hygiene (WASH). Environmental exposure is scaled by $\kappa$, the concentration of V. cholerae (cells per mL) associated with a 50% probability of infection (Fung 2014). Additional details regarding environmental compartments, water reservoirs, and climatic factors influencing transmission are given in the section on environmental transmission.
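As a rough illustration of Equations (3.2) and (3.4), the following Python sketch computes the expected number of new human-to-human and environmental infections among susceptibles at a single location. It is not the MOSAIC implementation: the function and argument names are hypothetical, and we assume that the imported term sums over infectious individuals at each origin $i$ weighted by $\pi_{ij}\tau_i$.

```python
def force_of_infection(S, I_local, N, W, beta_hum, beta_env, tau_j, theta_j,
                       kappa, I_by_origin=(), tau_by_origin=(), pi_to_j=(),
                       alpha_1=1.0, alpha_2=1.0):
    """Return (Lambda^S, Psi^S) for one location j and one time step.

    I_local is I_1 + I_2 at location j. The three *_by_origin / pi_to_j
    sequences hold, for every other location i, its infectious count,
    its departure probability tau_i, and the travel probability pi_ij.
    """
    # Infectious visitors arriving in j from each origin i (assumed reading of the sum).
    imported = sum(pi * tau * inf
                   for pi, tau, inf in zip(pi_to_j, tau_by_origin, I_by_origin))
    # Infectious residents who stay in j plus imported infections.
    mixing = (1.0 - tau_j) * I_local + imported
    lam = beta_hum * (1.0 - tau_j) * S * mixing ** alpha_1 / N ** alpha_2
    # Saturating dose-response in W with half-saturation kappa, reduced by WASH coverage.
    psi = beta_env * (1.0 - tau_j) * S * (1.0 - theta_j) * W / (kappa + W)
    return lam, psi
```

The same function applies to $V^{\text{sus}}_{1}$ and $V^{\text{sus}}_{2}$ by passing those compartment sizes in place of `S`.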

Note that all model processes are stochastic. Transition rates are converted to probabilities with the commonly used method based on the exponential waiting time distribution $p(t) = 1-e^{-rt}$ (see Ross 2007). Integer quantities are thus moved between model compartments at each time step according to a binomial process similar to the recovery of infected individuals $\gamma I_{jt}$:

\[\begin{equation} \frac{\partial R}{\partial t} \sim \text{Binom}(I_{jt}, 1-e^{-\gamma}). \end{equation}\]

For a detailed list of all stochastic transitions in the model, see the Table of stochastic transitions below.
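The rate-to-probability conversion and binomial draw described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names are hypothetical and the binomial sampler is a simple sum of Bernoulli trials rather than an optimized routine.

```python
import math
import random


def rate_to_prob(rate, dt=1.0):
    """Probability that an exponential waiting time with this rate ends within dt."""
    return 1.0 - math.exp(-rate * dt)


def binomial_draw(n, p, rng):
    """Number of successes in n independent trials, each with probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)


def recover_step(infected, gamma, rng):
    """Move an integer number of individuals from I to R over one daily step."""
    recovered = binomial_draw(infected, rate_to_prob(gamma), rng)
    return infected - recovered, recovered


rng = random.Random(42)  # fixed seed so the sketch is reproducible
remaining, recovered = recover_step(infected=100, gamma=0.2, rng=rng)
```

Because the draw is applied to integer compartment counts, the number leaving a compartment can never exceed the number currently in it.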

3.2 Latency

An important feature of the SVEIWRS model is the inclusion of an exposed compartment ($E$), which captures the latent period between exposure and the onset of infectiousness. In our model, individuals who become infected first enter the $E$ compartment, where they remain for a period governed by the incubation period $\iota$, before progressing to the infectious compartments $I_1$ (severe symptomatic infection) or $I_2$ (mild and/or asymptomatic infection).

A systematic review by Azman et al. (2013) estimated the median incubation period for cholera to be approximately $1.4 \ \text{days} \ (1.3–1.6 \ 95\% \text{CI})$. This relatively short latency is one of the key characteristics governing cholera dynamics and is critical for accurately capturing the rapid spatial spread observed during outbreaks.

3.3 Seasonality

Cholera transmission is seasonal and is typically associated with the rainy season, so both transmission rate terms $\beta_{jt}^{\text{*}}$ are temporally forced. For human-to-human transmission we used a sinusoidal mechanism as in Altizer et al. (2006). Specifically, the function is a truncated sine-cosine form of the Fourier series with two harmonics, which has the flexibility to capture seasonal transmission dynamics driven by extended rainy seasons and/or biannual trends:

\[\begin{equation} \beta_{jt}^{\text{hum}} = \beta_{j0}^{\text{hum}} \left(1 + a_1 \cos\left(\frac{2\pi t}{p}\right) + b_1 \sin\left(\frac{2\pi t}{p}\right) + a_2 \cos\left(\frac{4\pi t}{p}\right) + b_2 \sin\left(\frac{4\pi t}{p}\right)\right). \tag{3.6} \end{equation}\]

Here, $\beta_{j0}^{\text{hum}}$ is the mean human-to-human transmission rate at location $j$ over all time steps. Seasonal dynamics are determined by the parameters $a_1$, $b_1$ and $a_2$, $b_2$, which give the amplitudes of the first and second harmonics, respectively. The period $p$ is 365 days, so the function controls the temporal variation in $\beta_{jt}^{\text{hum}}$ over each day of the year.
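Equation (3.6) can be evaluated directly, as in the following Python sketch. The coefficient values used here are hypothetical and chosen only for illustration. Because each sine and cosine term sums to zero over a full period, the mean of the daily values over one year equals $\beta_{j0}^{\text{hum}}$, which the last lines check numerically.

```python
import math


def beta_hum(t, beta0, a1, b1, a2, b2, p=365):
    """Seasonally forced human-to-human transmission rate on day t (Equation 3.6)."""
    w = 2.0 * math.pi * t / p  # phase of the first harmonic
    seasonal = (a1 * math.cos(w) + b1 * math.sin(w)
                + a2 * math.cos(2.0 * w) + b2 * math.sin(2.0 * w))
    return beta0 * (1.0 + seasonal)


# Hypothetical coefficients for illustration only.
daily = [beta_hum(t, beta0=0.2, a1=0.3, b1=-0.1, a2=0.05, b2=0.02) for t in range(365)]
annual_mean = sum(daily) / len(daily)  # approximately equal to beta0
```

Note that coefficient combinations whose seasonal term drops below $-1$ would give a negative rate, so in practice the amplitudes need to be constrained or the result truncated at zero.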

We estimated the parameters in the Fourier series ($a_1$, $b_1$, $a_2$, $b_2$) using the Levenberg–Marquardt algorithm in the minpack.lm R library. Given the lack of reported cholera case data for many countries in SSA and the association between cholera transmission and the rainy season, we leveraged seasonal precipitation data to help fit the Fourier wave function to all countries. We first gathered weekly precipitation values from 1994 to 2024 for 30 uniformly distributed points within each country from the Open-Meteo Historical Weather Data API. Then we fit the Fourier series to the weekly precipitation data and used these parameters as the starting values when fitting the model to the more sparse cholera case data.

Figure 3.3: Example of a grid of 30 uniformly distributed points within Mozambique (A). The scatterplot shows weekly summed precipitation values at those 30 grid points and cholera cases plotted on a common Z-score scale, which expresses deviation from the mean in units of standard deviation. Fitted Fourier series functions are shown as blue (fit to precipitation data) and red (fit to cholera case data) lines.

For countries with no reported case data, we inferred seasonal dynamics using the fitted wave function of a neighboring country with available case data. The selected neighbor was chosen from the same cluster of countries (grouped hierarchically into four clusters based on precipitation seasonality using Ward’s method; see Figure 3.4) that had the highest correlation in seasonal precipitation with the country lacking case data. In the rare event that no country with reported case data was found within the same seasonal cluster, we expanded the search to the 10 nearest neighbors and continued expanding by adding the next nearest neighbor until a match was found.
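The first stage of this selection rule (same cluster, highest precipitation correlation) can be sketched as follows. This is a simplified illustration with hypothetical names and toy inputs; the fallback search over nearest neighbors is not implemented and is signalled by returning `None`.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pick_donor(target, cluster, precip, has_cases):
    """Choose the country whose fitted seasonal pattern is borrowed by `target`.

    cluster:   dict mapping country -> precipitation cluster id
    precip:    dict mapping country -> seasonal precipitation series
    has_cases: dict mapping country -> True if case data are available
    """
    candidates = [c for c in cluster
                  if c != target and has_cases[c] and cluster[c] == cluster[target]]
    if not candidates:
        return None  # caller would fall back to the nearest-neighbor search
    return max(candidates, key=lambda c: pearson(precip[target], precip[c]))
```

Restricting candidates to the same cluster before ranking by correlation means that a country with a very similar series but a different cluster assignment is never selected at this stage.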

Figure 3.4: A) Map showing the clustering of African countries based on their seasonal precipitation patterns (2014-2024). Countries are colored according to their cluster assignments, identified using hierarchical clustering. B) Fourier series fitted to weekly precipitation for each country. Each line plot shows the seasonal pattern for countries within a given cluster. Clusters are used to infer the seasonal transmission dynamics for countries where there are no reported cholera cases.

Using the model fitting methods described above, and the cluster-based approach for inferring the seasonal Fourier series pattern in countries without reported cholera cases, we modeled the seasonal dynamics for all 40 countries in the MOSAIC framework. These dynamics are visualized in Figure 3.5, with the corresponding Fourier model coefficients presented in Table 3.1.

Figure 3.5: Seasonal transmission patterns for all countries modeled in MOSAIC, as described by the truncated Fourier series in Equation (3.6). Blue lines give the Fourier series model fits for precipitation (1994-2024) and red lines give model fits to reported cholera cases (2023-2024). For countries where reported case data were not available, the Fourier model was inferred from the nearest country with the most similar seasonal precipitation patterns as determined by the hierarchical clustering. Countries with inferred case data from neighboring locations are annotated in red. The X-axis represents the weeks of the year (1-52), while the Y-axis shows the Z-score of weekly precipitation and cholera cases.

Table 3.1: Estimated coefficients for the truncated Fourier model in Equation (3.6) fit to countries with reported cholera cases. Model fits are shown in Figure 3.5.
	Fourier Coefficients
Country	$a_1$	$a_2$	$b_1$	$b_2$
Burundi	-0.42 (-0.52 to -0.32)	-0.3 (-0.4 to -0.21)	-0.06 (-0.16 to 0.04)	-0.22 (-0.32 to -0.12)
Cameroon	-0.97 (-1.15 to -0.78)	-0.08 (-0.26 to 0.1)	0.44 (0.27 to 0.62)	-0.71 (-0.89 to -0.53)
DRC	0.01 (-0.03 to 0.05)	-0.08 (-0.12 to -0.04)	0.23 (0.19 to 0.27)	-0.04 (-0.08 to 0)
Ethiopia	-0.42 (-0.47 to -0.36)	-0.12 (-0.17 to -0.06)	0.22 (0.16 to 0.27)	0.05 (0 to 0.11)
Ghana	-0.71 (-1.7 to 0.27)	1.31 (0.47 to 2.15)	-0.29 (-0.95 to 0.37)	-1.17 (-1.73 to -0.6)
Kenya	0.12 (-0.08 to 0.31)	-0.28 (-0.48 to -0.08)	0.93 (0.73 to 1.13)	0.25 (0.05 to 0.45)
Malawi	1.29 (1.06 to 1.53)	0.29 (0.05 to 0.53)	1.11 (0.87 to 1.35)	1.23 (0.99 to 1.47)
Mozambique	0.4 (0.23 to 0.57)	-0.7 (-0.87 to -0.53)	1.2 (1.03 to 1.37)	0.19 (0.02 to 0.36)
Niger	2.95 (1.4 to 4.51)	-3.42 (-4.63 to -2.2)	1.79 (0.8 to 2.79)	1.72 (0.94 to 2.51)
Nigeria	-0.25 (-0.39 to -0.11)	-0.3 (-0.43 to -0.16)	-0.91 (-1.05 to -0.77)	0.17 (0.04 to 0.31)
Somalia	-0.22 (-0.28 to -0.17)	-0.24 (-0.29 to -0.18)	0.91 (0.86 to 0.97)	-0.37 (-0.42 to -0.32)
South Africa	-2.33 (-3.43 to -1.22)	1.06 (0.06 to 2.07)	-2.72 (-3.74 to -1.71)	3.23 (2.1 to 4.36)
South Sudan	1.01 (0.73 to 1.29)	1.54 (1.3 to 1.78)	0.18 (-0.05 to 0.41)	0.02 (-0.22 to 0.26)
Tanzania	0.49 (0.37 to 0.61)	-0.13 (-0.25 to -0.02)	-0.6 (-0.71 to -0.48)	-0.29 (-0.41 to -0.18)
Togo	1.11 (0.77 to 1.46)	0.02 (-0.33 to 0.36)	-1 (-1.36 to -0.63)	-0.91 (-1.27 to -0.55)
Uganda	-0.17 (-0.56 to 0.22)	0.52 (0.13 to 0.9)	0.6 (0.22 to 0.99)	0.42 (0.04 to 0.81)
Zambia	1.53 (1.29 to 1.77)	0.88 (0.64 to 1.12)	0.67 (0.44 to 0.91)	0.78 (0.55 to 1.02)
Zimbabwe	0.99 (0.87 to 1.11)	0.34 (0.23 to 0.46)	0.54 (0.42 to 0.65)	0.12 (0 to 0.23)

3.4 Environmental transmission

Environmental transmission is a critical factor in cholera spread and consists of several key components: the rate at which infected individuals shed V. cholerae into the environment, the pathogen’s survival rate in environmental conditions, and the overall suitability of the environment for sustaining the bacteria over time.

To capture the impacts of climate drivers on cholera transmission, we have included the parameter $\psi_{jt}$, which represents the current state of environmental suitability with respect to: i) the survival time of V. cholerae in the environment and ii) the rate of environment-to-human transmission, which contributes to the overall force of infection. The environmental transmission rate is scaled by suitability as follows:

\[\begin{equation} \beta_{jt}^{\text{env}} = \beta_{j0}^{\text{env}} \Bigg(1 + \frac{\psi_{jt}-\bar\psi_j}{\bar\psi_j} \Bigg) \quad \text{and} \quad \bar\psi_j = \frac{1}{T} \sum_{t=1}^{T} \psi_{jt} \tag{3.7} \end{equation}\]

This formulation effectively scales the base environmental transmission rate $\beta_{j0}^{\text{env}}$ so that $\beta_{jt}^{\text{env}}$ varies over time according to the climatically driven model of suitability. Note that, unlike the periodic wave function of $\beta_{jt}^{\text{hum}}$, this temporal term can increase or decrease over time following multi-annual cycles.
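A short Python sketch of Equation (3.7) makes the scaling explicit. The function name and the suitability values are hypothetical. Algebraically the expression reduces to $\beta_{j0}^{\text{env}}\,\psi_{jt}/\bar\psi_j$, so the time-average of the scaled rate equals the base rate.

```python
def beta_env_series(beta0_env, psi):
    """Scale the base environmental transmission rate by relative suitability (Equation 3.7)."""
    psi_bar = sum(psi) / len(psi)  # mean suitability over all T time steps
    return [beta0_env * (1.0 + (p - psi_bar) / psi_bar) for p in psi]


# Hypothetical suitability values for three time steps.
scaled = beta_env_series(beta0_env=0.5, psi=[0.2, 0.4, 0.6])
```

With these inputs the rate is halved when suitability is half its mean and increased by 50% when suitability is 1.5 times its mean.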

3.4.1 Suitability-dependent decay rate

Suitability also influences how long V. cholerae survives in the environment. The decay rate $\delta_{jt}$ is modeled as the inverse of survival time, which varies with suitability. This is defined as:

\[ \delta_{jt} = \frac{1}{\text{days}_{\text{short}} + f\big(\psi_{jt}\big) \cdot \big(\text{days}_{\text{long}} - \text{days}_{\text{short}}\big)}. \tag{3.8} \]

Where $\text{days}_{\text{short}}$ is the shortest survival time (e.g., 3 days) and $\text{days}_{\text{long}}$ is the longest survival time (e.g., 90 days). Suitability is mapped to V. cholerae decay rate through a transformation function $f(\psi_{jt})$ that scales suitability values using a cumulative Beta distribution and two shape parameters $s_1$ and $s_2$: $f\big(\psi_{jt}\big) = \text{pbeta}(\psi_{jt} \mid s_1, \, s_2)$.

The transformation $f\big(\psi_{jt}\big) \in [0, 1]$ enables a range of functional forms, including linear, convex, concave, sigmoidal, or arcsine responses to suitability. This flexibility ensures that survival dynamics can reflect a variety of empirically plausible relationships with environmental conditions which can be seen in Figure 3.6.
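The mapping in Equation (3.8) can be sketched in Python as shown below. R's `pbeta` is not available in the Python standard library, so this illustration substitutes a simple midpoint-rule integration of the Beta density, which we assume is adequate only for shape parameters $s_1, s_2 \ge 1$. The default survival bounds of 3 and 90 days are the example values given above.

```python
import math


def pbeta(x, s1, s2, steps=2000):
    """Approximate Beta CDF by midpoint integration (stand-in for R's pbeta; s1, s2 >= 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(s1 + s2) - math.lgamma(s1) - math.lgamma(s2)
    h = x / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h  # midpoint of the k-th sub-interval, strictly inside (0, 1)
        total += math.exp(log_norm + (s1 - 1.0) * math.log(u)
                          + (s2 - 1.0) * math.log(1.0 - u))
    return min(1.0, total * h)


def decay_rate(psi, s1, s2, days_short=3.0, days_long=90.0):
    """Decay rate delta as the inverse of suitability-dependent survival time (Equation 3.8)."""
    survival_days = days_short + pbeta(psi, s1, s2) * (days_long - days_short)
    return 1.0 / survival_days
```

With $s_1 = s_2 = 1$ the transformation is linear, so a suitability of 0.5 gives a survival time of 46.5 days. Larger, equal shape parameters give a sigmoidal response that is still centered at 0.5.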

Figure 3.6: Relationship between environmental suitability ($\psi_{jt}$) and the survival and decay rate of V. cholerae in the environment ($\delta_{jt}$). Curves represent four transformation types used to map suitability to survival time via the cumulative Beta distribution with different shape parameters ($s_1$, $s_2$). The primary y-axis shows survival time in days; the secondary y-axis shows the corresponding decay rate, defined as $\delta_{jt} = 1/\text{days}(\psi_{jt})$. Horizontal dashed lines indicate the bounds on survival time, from 3 days (low suitability) to 90 days (high suitability) in this example.

3.4.2 Modeling environmental suitability

3.4.2.1 Environmental data

The mechanism for environment-to-human transmission (Equation (3.7)) and the rate of decay of V. cholerae in the environment (Equation (3.8)) are both driven by the parameter $\psi_{jt}$, which we refer to as environmental suitability. The parameter $\psi_{jt}$ is modeled as a time series for each location using a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) model and a suite of 24 covariates, which include 19 historical and forecasted climate variables under the MRI-AGCM3-2-S climate model. Covariates also include 4 large-scale climate drivers: the Indian Ocean Dipole Mode Index (DMI) and the El Niño Southern Oscillation (ENSO) from 3 different Pacific Ocean regions. We also included a location-specific variable giving the mean elevation for each country. See example time series of climate variables from one country (Mozambique) in Figure 3.7 and DMI and ENSO variables in Figure 3.8. A list of all covariates and their sources can be seen in Table 3.3.

Note that while the 19 climate variables offer forecasts up to 2030 and beyond, the forecasts of the DMI and ENSO variables are limited to 5 months into the future. Environmental suitability model predictions are therefore currently limited to a 5-month time horizon, but future iterations may allow for longer forecasts. Additional data sources will be integrated into subsequent versions of the suitability model. For instance, flood and cyclone data will likely be incorporated later, though not in the initial version of the model.

Figure 3.7: Climate data acquired from the OpenMeteo data API. Data were collected from 30 uniformly distributed points across each country and then aggregated to give weekly values of 17 climate variables from 1970 to 2030.

Figure 3.8: Historical and forecasted values of the Indian Ocean Dipole Mode Index (DMI) and the El Niño Southern Oscillation (ENSO) from 2015 to 2025. The ENSO values come from three different regions: Niño3 (central to eastern Pacific), Niño3.4 (central Pacific), and Niño4 (western-central Pacific). Data are from the National Oceanic and Atmospheric Administration (NOAA) and the Bureau of Meteorology (BOM).

Table 3.3: A full list of covariates and their sources used in the LSTM RNN model to predict the environmental suitability of *V. cholerae* ($\psi_{jt}$).
Covariate	Description	Source
temperature_2m_mean	Average temperature at 2 meters	OpenMeteo Historical Weather and Climate Change APIs
temperature_2m_max	Maximum temperature at 2 meters	OpenMeteo Historical Weather and Climate Change APIs
temperature_2m_min	Minimum temperature at 2 meters	OpenMeteo Historical Weather and Climate Change APIs
wind_speed_10m_mean	Average wind speed at 10 meters	OpenMeteo Historical Weather and Climate Change APIs
wind_speed_10m_max	Maximum wind speed at 10 meters	OpenMeteo Historical Weather and Climate Change APIs
cloud_cover_mean	Mean cloud cover	OpenMeteo Historical Weather and Climate Change APIs
shortwave_radiation_sum	Total shortwave radiation	OpenMeteo Historical Weather and Climate Change APIs
relative_humidity_2m_mean	Mean relative humidity at 2 meters	OpenMeteo Historical Weather and Climate Change APIs
relative_humidity_2m_max	Maximum relative humidity at 2 meters	OpenMeteo Historical Weather and Climate Change APIs
relative_humidity_2m_min	Minimum relative humidity at 2 meters	OpenMeteo Historical Weather and Climate Change APIs
dew_point_2m_mean	Mean dew point at 2 meters	OpenMeteo Historical Weather and Climate Change APIs
dew_point_2m_min	Minimum dew point at 2 meters	OpenMeteo Historical Weather and Climate Change APIs
dew_point_2m_max	Maximum dew point at 2 meters	OpenMeteo Historical Weather and Climate Change APIs
precipitation_sum	Total precipitation	OpenMeteo Historical Weather and Climate Change APIs
pressure_msl_mean	Mean sea level pressure	OpenMeteo Historical Weather and Climate Change APIs
soil_moisture_0_to_10cm_mean	Mean soil moisture at 0 to 10 cm	OpenMeteo Historical Weather and Climate Change APIs
et0_fao_evapotranspiration_sum	Total evapotranspiration (FAO method)	OpenMeteo Historical Weather and Climate Change APIs
DMI	Dipole Mode Index (DMI)	NOAA and BOM
ENSO3	El Niño Southern Oscillation (ENSO) - Region 3	NOAA and BOM
ENSO34	ENSO - Region 3.4	NOAA and BOM
ENSO4	ENSO - Region 4	NOAA and BOM
elevation	Mean elevation	Amazon Web Services Terrain Tiles

3.4.2.2 Deep learning neural network model

As mentioned above, we model environmental suitability $\psi_{jt}$ using a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) model. The LSTM model was developed using keras and tensorflow in R to predict binary outcomes. Thus the modeled quantity $\psi_{jt}$ is a proportion implying unsuitable conditions at 0 and perfectly suitable conditions at 1.

The model was fitted to reported case counts that were converted to a binary variable using a threshold of 200 reported cases per week. Given delays in reporting and the likely lead time of environmental suitability ahead of transmission and case reporting, we also set the week preceding each above-threshold week to be suitable. Where there were two consecutive weeks with >200 cases per week, we assumed that the preceding two weeks were also suitable. See Figure 3.9 for an example of how reported case counts are converted to a binary variable representing presumed environmental suitability for V. cholerae.
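One possible reading of this labeling rule is sketched below in Python. The function name is hypothetical, and two details are our assumptions rather than statements from the text: a week counts as above threshold when cases are strictly greater than 200, and the two-week lead is applied before the first week of each consecutive above-threshold pair.

```python
def suitability_labels(cases, threshold=200):
    """Convert weekly case counts to a 0/1 series of presumed environmental suitability."""
    n = len(cases)
    above = [c > threshold for c in cases]
    labels = [1 if flag else 0 for flag in above]
    for week in range(n):
        if not above[week]:
            continue
        if week >= 1:
            labels[week - 1] = 1  # one-week lead before any above-threshold week
        starts_pair = week + 1 < n and above[week + 1]
        if starts_pair and week >= 2:
            labels[week - 2] = 1  # two-week lead before two consecutive weeks
    return labels


# Hypothetical weekly counts: one isolated peak and one two-week peak.
example = suitability_labels([0, 0, 0, 250, 0, 0, 0, 0, 300, 400, 0])
```

In this example the isolated peak marks only the week before it, while the two-week peak marks the two weeks before it.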

Figure 3.9: Reported cases converted to binary variable for modeling environmental suitability.

The model is a Long Short-Term Memory (LSTM) neural network designed for binary classification, where environmental suitability, $\psi_{jt}$, is modeled as a function of the hidden state $h_t$ and hidden bias term $b_h$. Specifically, $\psi_{jt}$ is defined by a sigmoid activation function applied to the linear combination of the hidden state $h_t$, weighted by $w_h$, and the bias $b_h$, where $h_t$ is given by the 3 layers of the LSTM model:

\[\begin{equation} \psi_{jt} \sim \text{Sigmoid}(w_h \cdot h_t + b_h) \tag{3.9} \end{equation}\]

\[\begin{equation} h_t = \text{LSTM}\big(\text{temperature}_{jt}, \ \text{precipitation}_{jt}, \ \text{ENSO}_{t}, \dots \big) \end{equation}\]

In this formulation, $h_t$ represents the hidden state generated by the LSTM network based on input variables such as temperature, precipitation, and ENSO conditions, while $b_h$ is a bias term added to the output of the hidden state transformation.

The deep learning LSTM model consists of three stacked LSTM-RNN layers. The first LSTM layer has 500 units and the second and third LSTM layers have 250 and 100 units, respectively. The architecture of the LSTM model is configured to pass node values to subsequent LSTM layers, allowing deep learning of the more complex interactions among the climate variables over time. We limited model complexity in each LSTM layer using L2 regularization (penalty = 0.001) and used a dropout rate of 0.5 for each LSTM layer to further prevent overfitting on the limited amount of data. The final output layer was a dense layer with a single unit and a sigmoid activation function to produce a probability value for binary classification, i.e. a prediction of environmental suitability $\psi_{jt}$ on a scale of 0 to 1.

To fit the LSTM model to data, we modified the learning rate by applying an exponential decay schedule that started at 0.001 and decayed by a factor of 0.9 every 10,000 steps to enable smoother convergence. The model was compiled using the Adam optimizer with this learning rate schedule, along with binary cross-entropy as the loss function and accuracy as the evaluation metric. The model was trained for a maximum of 200 epochs with a batch size of 1024. We allowed model fitting to stop early with a patience parameter of 10 which halts training if no improvement is observed in validation accuracy for 10 consecutive epochs. To train the model we set aside 20% of the observed data for validation and also used 20% of the training data for model fitting. The training history, including loss and accuracy, was monitored over the course of training and gave a final test accuracy of 0.73 and a final test loss of 0.56 (see Figure 3.10).
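The learning rate schedule and early stopping rule described here can be illustrated without a deep learning library. The Python sketch below is not the keras code used for the model. It assumes a smooth (non-staircase) exponential decay and treats only a strict increase in validation accuracy as an improvement. Function names are hypothetical.

```python
def learning_rate(step, initial=0.001, decay_rate=0.9, decay_steps=10_000):
    """Exponentially decayed learning rate after a given number of optimizer steps."""
    return initial * decay_rate ** (step / decay_steps)


def early_stop_epoch(val_accuracy, patience=10):
    """Return the epoch (1-based) at which training halts under early stopping."""
    best = float("-inf")
    waited = 0
    for epoch, acc in enumerate(val_accuracy, start=1):
        if acc > best:
            best, waited = acc, 0  # improvement resets the patience counter
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_accuracy)  # ran through all supplied epochs without stopping


lr_after_20k = learning_rate(20_000)  # 0.001 * 0.9 ** 2
```

Under this schedule the learning rate falls to 90% of its initial value after 10,000 steps and to 81% after 20,000 steps.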

Figure 3.10: Model performance on training and validation data.

After model training was completed, we predicted the values of environmental suitability $\psi_{jt}$ across all time steps for each location. Predictions start in January 1970 and go up to 5 months past the present date (currently February 2025). Given the amount of noise in the model predictions, we applied a simple LOESS smoother with a logit transformation to smooth model predictions over time and give a more stable value of $\psi_{jt}$ when incorporating it into other model features (e.g. Equations (3.7) and (3.8)). The resulting model predictions are shown for an example country (Mozambique) in Figure 3.11, which compares model predictions to the original case counts and the binary classification. Predictions for all model locations are shown in a simplified view in Figure 3.12.

Please note that this initial version of the model is fitted to a rather small amount of data. Model hyperparameters were specifically chosen to reduce overfitting. We therefore recommend against over-interpreting the time series predictions of the model at this early stage, since they are likely to change and improve as more historical incidence data are included in future versions.

Figure 3.11: The LSTM model predictions over time and reported cases for an example country (Mozambique). Reported cases are shown in the top panel and the shaded areas show the binary classification used to characterize environmental suitability. Raw model predictions are shown as the transparent brown line, with the solid black line showing the LOESS smoothing. Forecasted values beyond the current time point are shown in orange and are limited to a 5-month time horizon.

Figure 3.12: The smoothed LSTM model predictions (lines) and binary suitability classification (shaded areas) over time for all countries in the MOSAIC framework. Orange lines show forecasts beyond the current date. With ENSO and DMI covariates included in the model, forecasts are limited to 5 months.

3.4.3 Shedding of V. cholerae

The rate at which infected individuals shed Vibrio cholerae into the environment is a critical factor influencing cholera transmission dynamics. Shedding rates vary widely depending on the severity of infection, the host immune response, and environmental conditions. To reflect this heterogeneity, the model distinguishes between two types of infected individuals:

Symptomatic individuals ($I_1$), who tend to shed substantially more bacteria for longer due to more severe gastrointestinal symptoms;
Asymptomatic individuals ($I_2$), who shed less per capita and for a shorter period of time, but may contribute significantly to environmental contamination due to their larger numbers.

According to the modeling study by Fung (2014), estimates of V. cholerae shedding across the population can range from 0.01 to 10 cells per mL per person per day. However, this estimate does not fully capture the range of possible shedding that can occur depending on the type of infection. In contrast, Nelson et al. (2009) report that individuals may shed between $10^3$ $\text{cells}~\text{g}^{-1}~\text{stool}$ in asymptomatic cases and up to $10^{12}$ $\text{cells}~\text{g}^{-1}~\text{stool}$ in severe symptomatic infections. While these quantities are expressed in slightly different units from the $\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}$ used in cholera transmission models, they imply that symptomatic individuals may shed several orders of magnitude more bacteria into the environment per day than asymptomatic individuals.

To account for the uncertainty in levels of V. cholerae shedding, we collated a short list of studies that either report empirical findings or modeling analyses that set priors for shedding parameters. These sources reflect a wide range of assumptions and contexts, but nonetheless provide a spectrum of estimated V. cholerae shedding rates that we can use to inform our model (see the table of shedding parameters below).

We currently assume that shedding rates are constant and drawn from independent uniform distributions in units of $\mathbf{\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}}$, which is consistent with the frequently cited sources for shedding rates of Codeço 2001 and others:

\[ \begin{aligned} \zeta_1 \ \sim \ &\text{Uniform}(10^4,\ 10^{8}) \quad \mathbf{\text{(symptomatic shedding)}},\\ \zeta_2 \ \sim \ &\text{Uniform}(0.01,\ 10^3) \quad \mathbf{\text{(asymptomatic shedding)}}. \end{aligned} \tag{3.10} \]

The definition of these priors assumes that:

the watery stool of infected individuals has approximately the same density as water (1kg/L), such that $10^5 \text{cells}~\text{g}^{-1}\text{day}^{-1} \approx 10^5 \text{cells}~\text{mL}^{-1}\text{person}^{-1}\text{day}^{-1}$, and
shedding in symptomatic individuals is always greater than that of asymptomatic individuals with the potential to be many orders of magnitude greater.

These priors also reflect the observed variability in the literature while preserving identifiability in model fitting. The upper bound for symptomatic shedding ($10^8$) is conservative relative to extreme values (e.g., $10^{12}$ cells/L in rice water stool), but comfortably spans values seen in both clinical observations and theoretical models. The lower bound ($10^4$) ensures that small but still epidemiologically significant shedding is captured. For asymptomatic individuals, the range of 0.01 to $10^3$ $\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}$ captures both low empirical estimates (e.g. Mosley et al. 1968) and broader assumptions made in cholera transmission models (e.g. Codeço 2001). The range of these priors therefore provides sufficient flexibility to represent both high-intensity shedding in severe cases and low-level contributions distributed across a larger number of asymptomatic individuals.

The table below summarizes key published estimates and assumptions regarding V. cholerae and related bacterial shedding rates:

Value(s)	Units	Infection	Description	Source
$10^3$	$\text{cells}~\text{g}^{-1}~\text{stool}$	Asymptomatic	Approx. 1 day of shedding at ~10³ vibrios per gram of stool	Mosley et al. (1968)
$10^6$–$10^9$	$\text{cells}~\text{g}^{-1}~\text{stool}$	NA	Number of fecal coliform indicator bacteria in human feces	Feachem et al. (1983)
$1$–$100$	$\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}$	All	Point estimate of 10; range 1–100 used in sensitivity analysis	Codeço (2001)
$10$	$\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}$	All	Assumed shedding rate used in epidemic model incorporating hyperinfectivity	Hartley et al. (2006)
$\leq 10^5$	$\text{cells}~\text{g}^{-1}~\text{stool}$	Asymptomatic	No symptoms; low-level shedding of vibrios	Nelson et al. (2009)
$\leq 10^8$	$\text{cells}~\text{g}^{-1}~\text{stool}$	Mild	Diarrhoea with moderate vibrios in stool	Nelson et al. (2009)
$10^7$–$10^9$	$\text{cells}~\text{g}^{-1}~\text{stool}$	Severe	Vomiting and profuse diarrhoea with high shedding	Nelson et al. (2009)
$10^{10}$–$10^{12}$	$\text{cells}~\text{L}^{-1}~\text{stool}$	Severe	Concentration in rice water stool from symptomatic individuals	Nelson et al. (2009)
$0.01$–$10$	$\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}$	All	Reported as general estimate across all infections	Fung (2014)
$10$–$100$	$\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}$	All	Represents shedding rates in two distinct sub-populations	Njagarah & Nyabadza (2014)

3.4.4 Recovery rates

The recovery rates in the MOSAIC model are defined as the inverse of the shedding duration for infected individuals. This reflects the period during which individuals contribute to the environmental load of Vibrio cholerae, regardless of the presence of clinical symptoms. The model distinguishes between:

Symptomatic individuals ($\gamma_1$):
Individuals in the $I_1$ compartment typically experience acute watery diarrhea and may shed large quantities of V. cholerae for several days. Clinical studies and reviews suggest that symptomatic patients shed vibrios for approximately 3 to 5 days, with shedding sometimes persisting up to 14 days (Nelson et al. 2009, Harris et al. 2012). Based on these estimates, we define the recovery rate as a uniform distribution over plausible durations:

\[ \gamma_1 \sim \text{Uniform}(1/7,\ 1/3) \quad \text{day}^{-1} \]

Asymptomatic individuals ($\gamma_2$):
Asymptomatic individuals in the $I_2$ compartment may not show clinical symptoms but can still shed V. cholerae for extended periods. Observational studies indicate shedding can persist for 7 to 14 days, and potentially longer (Mosley et al. 1968, Public Health Ontario 2022). To capture this range, we define the asymptomatic recovery rate as:

\[ \gamma_2 \sim \text{Uniform}(1/14,\ 1/7) \quad \text{day}^{-1} \]

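In cases where point estimates are preferred or required for model fitting, we use the rates implied by representative shedding durations of 5 days (symptomatic) and 10 days (asymptomatic), which lie near the center of each range: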

\[ \gamma_1 = \frac{1}{5} = 0.2 \ \text{day}^{-1}, \qquad \gamma_2 = \frac{1}{10} = 0.1 \ \text{day}^{-1} \]

These parameterizations reflect the empirical difference in shedding durations by symptom status and are consistent with previous cholera transmission models (e.g., Codeço 2001). They also align with the structure of the environmental shedding process and the $I_1$, $I_2$ compartments in the model.
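The relationship between shedding duration and recovery rate can be checked with a few lines of arithmetic. The sketch below (variable names are illustrative) compares the point estimates with the arithmetic means of the two uniform priors. The two differ slightly because the mean of a rate is not the inverse of the mean duration.

```python
# Bounds of the uniform priors on the recovery rates (per day).
gamma_1_bounds = (1 / 7, 1 / 3)    # symptomatic: durations of 7 and 3 days
gamma_2_bounds = (1 / 14, 1 / 7)   # asymptomatic: durations of 14 and 7 days


def uniform_mean(bounds):
    """Arithmetic mean of a Uniform(low, high) distribution."""
    low, high = bounds
    return (low + high) / 2


prior_mean_gamma_1 = uniform_mean(gamma_1_bounds)
prior_mean_gamma_2 = uniform_mean(gamma_2_bounds)

# Point estimates: inverse of 5-day and 10-day shedding durations.
point_gamma_1 = 1 / 5
point_gamma_2 = 1 / 10

print(f"gamma_1: prior mean {prior_mean_gamma_1:.3f}, point estimate {point_gamma_1:.3f}")
print(f"gamma_2: prior mean {prior_mean_gamma_2:.3f}, point estimate {point_gamma_2:.3f}")
```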

Figure 3.13: Estimated shedding duration (x-axis) for symptomatic and asymptomatic V. cholerae infections. Shaded bars indicate the assumed range of plausible durations; solid vertical lines mark the mean value for each group. These durations are used to derive recovery rates ($\gamma_1$ and $\gamma_2$) as the inverse of duration and parameterize the infectious period in the MOSAIC transmission model.

3.4.5 Water, Sanitation, and Hygiene (WASH)

Since V. cholerae is transmitted through fecal contamination of water and other consumables, the level of exposure to contaminated substrates significantly impacts transmission rates. Interventions involving Water, Sanitation, and Hygiene (WASH) have long been a first line of defense in reducing cholera transmission, and in this context, WASH variables can serve as a proxy for the rate of contact with environmental risk factors. In the MOSAIC model, WASH variables are incorporated mechanistically, allowing for intervention scenarios that include changes to WASH. However, it is necessary to distill available WASH variables into a single parameter that represents the WASH-determined contact rate with contaminated substrates for each location $j$, which we define as $\theta_j$.

To parameterize $\theta_j$, we calculated a weighted mean of the 8 WASH variables in Sikder et al 2023 and originally modeled by the Local Burden of Disease WaSH Collaborators 2020. The 8 WASH variables (listed in Table 3.4) provide population-weighted measures of the proportion of the population that either: i) have access to WASH resources (e.g., piped water, septic or sewer sanitation), or ii) are exposed to risk factors (e.g. surface water, open defecation). For risk associated WASH variables, we used the complement ($1-\text{value}$) to give the proportion of the population not exposed to each risk factor. We used the optim function in R and the L-BFGS-B algorithm to estimate the set of optimal weights (Table 3.4) that maximize the strength of the correlation between the weighted mean of the 8 WASH variables and reported cholera incidence per 1000 population across 40 SSA countries from 2000 to 2016. The optimal weighted mean had a correlation coefficient of $r =$ -0.33 (-0.51 to -0.09 95% CI), which was stronger than that of the basic mean and of each individual WASH variable (see Figure 3.14). The weighted mean then provides a single variable between 0 and 1 that represents the overall proportion of the population that has access to WASH and/or is not exposed to environmental risk factors. Thus, the WASH-mediated contact rate with sources of environmental transmission is represented as ($1-\theta_j$) in the environment-to-human force of infection ($\Psi_{jt}$). Values of $\theta_j$ for all countries are shown in Figure 3.15.

Table 3.4: Table of optimized weights used to calculate the single mean WASH index for all countries.
WASH variable	Optimized weight
Piped Water	0.356
Septic or Sewer Sanitation	0.014
Other Improved Water	0.000
Other Improved Sanitation	0.000
Surface Water	0.504
Unimproved Sanitation	0.000
Unimproved Water	0.000
Open Defecation	0.126
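To make the construction of $\theta_j$ concrete, the sketch below applies the optimized weights from the table above to one set of WASH proportions. The country values are invented for illustration, and the split of variables into access and risk groups is an assumption based on the description in the text.

```python
# Optimized weights from the table of WASH weights.
weights = {
    "Piped Water": 0.356,
    "Septic or Sewer Sanitation": 0.014,
    "Other Improved Water": 0.000,
    "Other Improved Sanitation": 0.000,
    "Surface Water": 0.504,
    "Unimproved Sanitation": 0.000,
    "Unimproved Water": 0.000,
    "Open Defecation": 0.126,
}

# Assumption: these variables are treated as risk factors and complemented.
risk_variables = {
    "Surface Water",
    "Unimproved Sanitation",
    "Unimproved Water",
    "Open Defecation",
}


def wash_index(coverage):
    """Weighted mean of WASH proportions, using 1 - value for risk variables."""
    total_weight = sum(weights.values())
    weighted_sum = 0.0
    for name, weight in weights.items():
        value = coverage[name]
        if name in risk_variables:
            value = 1.0 - value
        weighted_sum += weight * value
    return weighted_sum / total_weight


# Hypothetical proportions of the population for one location.
example_country = {
    "Piped Water": 0.30,
    "Septic or Sewer Sanitation": 0.10,
    "Other Improved Water": 0.40,
    "Other Improved Sanitation": 0.25,
    "Surface Water": 0.10,
    "Unimproved Sanitation": 0.30,
    "Unimproved Water": 0.20,
    "Open Defecation": 0.20,
}

theta_j = wash_index(example_country)
contact_rate = 1.0 - theta_j  # enters the environment-to-human force of infection
print(round(theta_j, 4), round(contact_rate, 4))
```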

Figure 3.14: Relationship between WASH variables and cholera incidence.

Figure 3.15: The optimized weighted mean of WASH variables for AFRO countries. Countries labeled in orange denote countries with an imputed weighted mean WASH variable. Imputed values are the weighted mean from the 3 most similar countries.

3.5 Immune dynamics

Aside from the current number of infections, population susceptibility is one of the key factors influencing the spread of cholera. Further, since immunity from both vaccination and natural infection provides long-lasting protection, it’s crucial to quantify not only the incidence of cholera but also the number of past vaccinations. Additionally, we need to estimate how many individuals with immunity remain in the population at any given time step in the model.

To achieve this, we estimate the vaccination rate over time ($\nu_{jt}$) based on historical vaccination campaigns and incorporate a model of vaccine effectiveness ($\phi$) and immune decay post-vaccination ($\omega$) to estimate the current number of individuals with vaccine-derived immunity. We also account for the immune decay rate from natural infection ($\varepsilon$), which is generally considered to last longer than immunity from vaccination.

3.5.1 Estimating Vaccination Rates

To estimate the past and current vaccination rates, we sourced data on reported OCV vaccinations from the WHO International Coordinating Group (ICG) Cholera vaccine dashboard. This resource lists all reactive OCV campaigns conducted from 2016 to the present, with approximately 103 million OCV doses shipped to Sub-Saharan African (SSA) countries as of October 9, 2024. However, these data only capture reactive vaccinations in emergency settings and do not include preventive campaigns organized by GAVI and in-country partners.

As a result, our current estimates of the OCV vaccination rate likely underestimate total OCV coverage. We are working to expand our data sources to better reflect the full number of OCV doses distributed in SSA and will update the results here as soon as these are available.

To translate the reported number of OCV doses into the model parameter $\nu_{jt}$, we take the number of doses shipped and the reported start date of the vaccination campaign, distributing the doses over subsequent days according to a maximum daily vaccination rate. Therefore, the vaccination rate $\nu_{jt}$ is not an estimated quantity; it is defined by the reported number of OCV doses administered together with an assumption about the daily rate of distribution for an OCV campaign:

\[ \nu_{jt} = f\big(\text{reported OCV doses distributed}_{jt} \ | \ \text{daily distribution rate}\big). \]

See Figure 3.16 for an example of OCV distribution using a maximum daily vaccination rate of 100,000. The resulting time series for each country is shown in Figure 3.17, with current totals based on the WHO ICG data displayed in Figure 3.18.
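One simple way to implement this allocation is sketched below. It assumes that doses are administered at the maximum daily rate from the campaign start date until the shipment is exhausted. The function name and the shipment size are hypothetical.

```python
def distribute_doses(total_doses, max_daily_rate=100_000):
    """Spread a shipment over consecutive days, capped at `max_daily_rate` per day."""
    daily = []
    remaining = total_doses
    while remaining > 0:
        given_today = min(remaining, max_daily_rate)
        daily.append(given_today)
        remaining -= given_today
    return daily


# Hypothetical shipment of 250,000 doses, starting on the reported campaign date.
campaign = distribute_doses(250_000)
print(campaign)
```

Each element of the returned list is the number of doses assigned to one day of the campaign, so the list sums to the size of the shipment.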

Figure 3.16: Example of the estimated vaccination rate during an OCV campaign.

Figure 3.17: The estimated vaccination coverage across all countries with reported vaccination data on the WHO ICG dashboard.

Figure 3.18: The total cumulative number of OCV doses distributed through the WHO ICG from 2016 to present day.

3.5.2 Immunity from vaccination

The impact of Oral Cholera Vaccine (OCV) campaigns is incorporated into the model through the Vaccinated compartment (V). The rate at which individuals are effectively vaccinated is defined as $\phi\nu_{jt}$, where $\nu_{jt}$ is the number of OCV doses administered in location $j$ at time $t$ and $\phi$ is the estimated vaccine effectiveness. The vaccination rate $\nu_{jt}$ is not an estimated quantity. Rather, it is directly defined by the reported number of OCV doses administered as described above. Note that there is just one vaccinated compartment at this time, though future model versions may include $V_1$ and $V_2$ compartments to explore two-dose vaccination strategies or to emulate more complex waning patterns.

The evidence for waning immunity comes from 4 cohort studies (Table 3.6) from Bangladesh (Qadri et al 2016 and 2018), South Sudan (Azman et al 2016), and Democratic Republic of Congo (Malembaka et al 2024).

Table 3.6: Summary of Effectiveness Data
Day (midpoint)	Effectiveness	Upper CI	Lower CI	Day (min)	Day (max)	Source
60.0	0.873	0.990	0.702	NA	NA	Azman et al (2016)
93.5	0.400	0.600	0.110	7	180	Qadri et al (2016)
368.5	0.390	0.520	0.230	7	730	Qadri et al (2018)
435.0	0.527	0.674	0.314	360	510	Malembaka et al (2024)
900.0	0.447	0.594	0.248	720	1080	Malembaka et al (2024)

We estimated vaccine effectiveness and waning immunity by fitting an exponential decay model to the reported effectiveness of one dose OCV in these studies using the following formulation:

\[\begin{equation} \text{Proportion immune}\ t \ \text{days after vaccination} = \phi \times (1 - \omega) ^ {t-t_{\text{vaccination}}} \tag{3.11} \end{equation}\]

Where $\phi$ is the effectiveness of one dose OCV and, based on this specification, also the initial proportion immune directly after vaccination. The decay rate parameter $\omega$ is the rate at which initial vaccine derived immunity decays per day post vaccination, and $t$ and $t_{\text{vaccination}}$ are the time (in days) the function is evaluated at and the time of vaccination respectively. When we fitted the model to the data from the cohort studies shown in Table 3.6 we found that $\omega = 0.00057$ ($0-0.0019$ 95% CI), which gives a mean estimate of 4.8 years for vaccine derived immune duration with unreasonably wide confidence intervals (1.4 years to infinite immunity). However, the point estimate of 4.8 years is consistent with anecdotal evidence that one dose OCV is effective for at least 3 years.
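The decay model in Equation (3.11) is simple to evaluate directly. The sketch below uses the point estimates of $\omega$ and $\phi$ reported in this section to recover the implied mean duration ($1/\omega$) and the proportion immune at selected times. The function and variable names are illustrative only.

```python
omega = 0.00057  # daily decay rate of vaccine derived immunity (point estimate)
phi = 0.64       # initial proportion immune (point estimate of effectiveness)


def proportion_immune(days_since_vaccination, phi=phi, omega=omega):
    """Equation (3.11): phi * (1 - omega) ** (t - t_vaccination)."""
    return phi * (1 - omega) ** days_since_vaccination


mean_duration_years = 1 / omega / 365
immune_after_one_year = proportion_immune(365)
immune_after_three_years = proportion_immune(3 * 365)

print(f"mean duration: {mean_duration_years:.1f} years")
print(f"proportion immune after 1 year: {immune_after_one_year:.2f}")
print(f"proportion immune after 3 years: {immune_after_three_years:.2f}")
```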

The wide confidence intervals are likely due to the wide range of reported estimates for proportion immune after a short duration in the 7–90 days range (Azman et al 2016 and Qadri et al 2016). Therefore, we chose to use the point estimate of $\omega$ and incorporate uncertainty based on the initial proportion immune (i.e. vaccine effectiveness $\phi$) shortly after vaccination. Using the decay model in Equation (3.11) we estimated $\phi$ to be $0.64$ ($0.32-0.96$ 95% CI). We then fit a Beta distribution to the quantiles of $\phi$ by minimizing the sums of squares using the Nelder-Mead optimization algorithm to render the following distribution (shown in Figure 3.19B):

\[\begin{equation} \phi \sim \text{Beta}(4.57, 2.41). \tag{3.12} \end{equation}\]
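The quantile-matching step can be sketched as follows, using SciPy in place of the R workflow. This is an illustration under stated assumptions rather than the fitting code used for Equation (3.12): it treats the point estimate as the median, optimizes on the log scale to keep both shape parameters positive, and starts from arbitrary values. It is therefore not expected to reproduce the reported shape parameters exactly.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta

# Target quantiles: 2.5%, 50% (assumed median), and 97.5% of phi.
probs = np.array([0.025, 0.5, 0.975])
targets = np.array([0.32, 0.64, 0.96])


def sum_of_squares(log_params):
    """Squared distance between Beta quantiles and the target quantiles."""
    a, b = np.exp(log_params)  # exponentiate so both shapes stay positive
    return float(np.sum((beta.ppf(probs, a, b) - targets) ** 2))


result = minimize(sum_of_squares, x0=np.log([2.0, 2.0]), method="Nelder-Mead")
shape_1, shape_2 = np.exp(result.x)
fitted_quantiles = beta.ppf(probs, shape_1, shape_2)

print(f"Beta({shape_1:.2f}, {shape_2:.2f})")
print("fitted quantiles:", np.round(fitted_quantiles, 2))
```

Comparing `fitted_quantiles` with `targets` gives a quick check of how well a Beta distribution can match the three reported values.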

Figure 3.19: This is vaccine effectiveness

3.5.3 Immunity from natural infection

The duration of immunity after a natural infection is likely to be longer lasting than that from vaccination with OCV (especially given the current one dose strategy). As in most SIR-type models, the rate at which individuals leave the Recovered compartment is governed by the immune decay parameter $\varepsilon$. We estimated the durability of immunity from natural infection based on two cohort studies and fit the following exponential decay model to estimate the rate of immunity decay over time:

\[ \text{Proportion immune}\ t \ \text{days after infection} = 0.99 \times (1 - \varepsilon) ^ {t-t_{\text{infection}}} \] Where we make the necessary and simplifying assumption that within 0–90 days after natural infection with V. cholerae, individuals are 95–99% immune. We fit this model to reported data from Ali et al (2011) and Clemens et al (1991) (see Table 3.7).

Table 3.7: Sources for the duration of immunity from natural infection.
Day	Effectiveness	Upper CI	Lower CI	Source
90	0.95	0.95	0.95	Assumption
1080	0.65	0.81	0.37	Ali et al (2011)
1260	0.61	0.81	0.21	Clemens et al (1991)

We estimated the mean immune decay to be $\bar\varepsilon = 3.9 \times 10^{-4}$ ($1.7 \times 10^{-4}-1.03 \times 10^{-3}$ 95% CI) which is equivalent to an immune duration of $7.21$ years ($2.66-16.1$ years 95% CI) as shown in Figure 3.20A. This is slightly longer than previous modeling work estimating the duration of immunity to be ~5 years (King et al 2008). Uncertainty around $\varepsilon$ in the model is then represented by a Log-Normal distribution as shown in Figure 3.20B:

\[ \varepsilon \sim \text{Lognormal}(\bar\varepsilon+\frac{\sigma^2}{2}, 0.25) \]

Figure 3.20: The duration of immunity after natural infection with V. cholerae.

3.6 Spatial dynamics

The parameters in the model diagram in Figure 3.2 that have a $jt$ subscript denote the spatial structure of the model. Each country is modeled as an independent metapopulation that is connected to all others via the spatial force of infection $\Lambda_{jt}$ which moves contagion among metapopulations according to the connectivity provided by parameters $\tau_i$ (the probability of departure) and $\pi_{ij}$ (the probability of diffusion to destination $j$). Both parameters are estimated using the departure-diffusion model below which is fitted to average weekly air traffic volume between all of the 41 countries included in the MOSAIC framework (Figure 3.21).

Figure 3.21: The average number of air passengers per day in 2017 among all countries.

Figure 3.22: A network map showing the average number of air passengers per day in 2017.

3.6.1 Human mobility model

The departure-diffusion model estimates diagonal and off-diagonal elements in the mobility matrix ($M$) separately and combines them using conditional probability rules. The model first estimates the probability of travel outside the origin location $i$—the departure process—and then the distribution of travel from the origin location $i$ by normalizing connectivity values across all $j$ destinations—the diffusion process. The values of $\pi_{ij}$ sum to unity along each row, but the diagonal is not included, indicating that this is a relative quantity. That is to say, $\pi_{ij}$ gives the probability of going from $i$ to $j$ given that travel outside origin $i$ occurs. Therefore, we can use basic conditional probability rules to define the travel routes in the diagonal elements (trips made within the origin $i$) as \[ \Pr( \neg \text{depart}_i ) = 1 - \tau_i \] and the off-diagonal elements (trips made outside origin $i$) as \[ \Pr( \text{depart}_i, \text{diffuse}_{i \rightarrow j}) = \Pr( \text{diffuse}_{i \rightarrow j} \mid \text{depart}_i ) \Pr(\text{depart}_i ) = \pi_{ij} \tau_i. \] The expected mean number of trips for route $i \rightarrow j$ is then:

\[\begin{equation} M_{ij} = \begin{cases} \theta N_i (1-\tau_i) \ & \text{if} \ i = j \\ \theta N_i \tau_i \pi_{ij} \ & \text{if} \ i \ne j. \end{cases} \tag{3.13} \end{equation}\]

Where, $\theta$ is a proportionality constant representing the overall number of trips per person in an origin population of size $N_i$, $\tau_i$ is the probability of leaving origin $i$, and $\pi_{ij}$ is the probability of travel to destination $j$ given that travel outside origin $i$ occurs.
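A minimal sketch of Equation (3.13) is given below for three hypothetical locations. The population sizes, departure probabilities, and diffusion matrix are invented for illustration, and the function name is not part of any MOSAIC code.

```python
def mobility_matrix(populations, tau, pi, trips_per_person=1.0):
    """Expected trips M[i][j] built from departure (tau) and diffusion (pi) terms."""
    n = len(populations)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Trips that stay within origin i.
                M[i][j] = trips_per_person * populations[i] * (1 - tau[i])
            else:
                # Trips that leave i and arrive in j.
                M[i][j] = trips_per_person * populations[i] * tau[i] * pi[i][j]
    return M


# Hypothetical inputs; each row of pi sums to 1 with a zero diagonal.
populations = [1000, 2000, 500]
tau = [0.1, 0.05, 0.2]
pi = [
    [0.0, 0.7, 0.3],
    [0.6, 0.0, 0.4],
    [0.5, 0.5, 0.0],
]

M = mobility_matrix(populations, tau, pi)
row_totals = [sum(row) for row in M]
print(M[0], row_totals)
```

Because each row of $\pi_{ij}$ sums to one, every row of $M$ sums to $\theta N_i$, which is a convenient consistency check.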

3.6.2 Estimating the departure process

The probability of travel outside the origin is estimated for each location $i$ to give the location-specific departure probability $\tau_i$. \[ \tau_i \sim \text{Beta}(1+s, 1+r) \] Binomial probabilities for each origin $\tau_i$ are drawn from a Beta distributed prior with shape ($s$) and rate ($r$) parameters. \[ \begin{aligned} s &\sim \text{Gamma}(0.01, 0.01)\\ r &\sim \text{Gamma}(0.01, 0.01) \end{aligned} \]

3.6.3 Estimating the diffusion process

We use a normalized formulation of the power law gravity model to define the diffusion process, the probability of travelling to destination $j$ given travel outside origin $i$ ($\pi_{ij}$), which is defined as:

\[\begin{equation} \pi_{ij} = \frac{ N_j^\omega d_{ij}^{-\gamma} }{ \sum\limits_{\forall j \ne i} N_j^\omega d_{ij}^{-\gamma} } \tag{3.14} \end{equation}\]

Where, $\omega$ scales the attractive force of each $j$ destination based on its population size $N_j$. The kernel function $d_{ij}^{-\gamma}$ serves as a penalty on the proportion of travel from $i$ to $j$ based on distance. Prior distributions of diffusion model parameters are defined as: \[ \begin{aligned} \omega &\sim \text{Gamma}(1, 1)\\ \gamma &\sim \text{Gamma}(1, 1) \end{aligned} \]
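The normalization in Equation (3.14) can be sketched as follows. The populations, distances, and parameter values are hypothetical and are chosen only to show that each row of $\pi_{ij}$ sums to one with a zero diagonal.

```python
def diffusion_probabilities(populations, distances, omega, gamma):
    """Normalized power law gravity model: pi[i][j] for all j != i."""
    n = len(populations)
    pi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Unnormalized attraction of each destination; the diagonal is excluded.
        weights = [
            populations[j] ** omega * distances[i][j] ** (-gamma) if j != i else 0.0
            for j in range(n)
        ]
        total = sum(weights)
        pi[i] = [w / total for w in weights]
    return pi


# Hypothetical populations and pairwise distances (km).
populations = [1e6, 2e6, 5e5]
distances = [
    [0.0, 100.0, 400.0],
    [100.0, 0.0, 300.0],
    [400.0, 300.0, 0.0],
]

pi = diffusion_probabilities(populations, distances, omega=1.0, gamma=2.0)
print([round(p, 4) for p in pi[0]])
```

In this example the nearer and larger destination receives almost all of the travel leaving the first location.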

The models for $\tau_i$ and $\pi_{ij}$ were fitted to air traffic data from OAG using the mobility R package (Giles 2020). Estimates for mobility model parameters are shown in Figures 3.23 and 3.24.

Figure 3.23: The estimated weekly probability of travel outside of each origin location $\tau_i$ and 95% confidence intervals is shown in panel A with the population mean indicated as a red dashed line. Panel B shows the estimated total number of travelers leaving origin $i$ each day.

Figure 3.24: The diffusion process $\pi_{ij}$ which gives the estimated probability of travel from origin $i$ to destination $j$ given that travel outside of origin $i$ has occurred.

3.6.4 The probability of spatial transmission

The likelihood of introductions of cholera from disparate locations is a major concern during cholera outbreaks. However, this can be difficult to characterize given the endemic dynamics and patterns of human movement. We include a few measures of spatial heterogeneity here and the first is a simple importation probability based on connectivity and the possibility of incoming infections. The basic probability of transmission from an origin $i$ to a particular destination $j$ and time $t$ is defined as:

\[\begin{equation} p(i,j,t) = 1 - e^{-\beta_{jt}^{\text{hum}} (((1-\tau_j)S_{jt})/N_{jt}) \pi_{ij}\tau_iI_{it}} \tag{3.15} \end{equation}\]
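Equation (3.15) can be evaluated directly for a single origin-destination pair. The sketch below uses invented parameter values, and the function and argument names are illustrative.

```python
import math


def importation_probability(beta_hum, tau_i, tau_j, pi_ij, S_j, N_j, I_i):
    """Equation (3.15): probability of transmission from origin i to destination j."""
    # Susceptible fraction of destination j that has not departed.
    susceptible_fraction = (1 - tau_j) * S_j / N_j
    # Expected number of infected travelers moving from i to j.
    incoming_infected = pi_ij * tau_i * I_i
    return 1 - math.exp(-beta_hum * susceptible_fraction * incoming_infected)


# Hypothetical values for one origin-destination pair at one time step.
p = importation_probability(
    beta_hum=0.2, tau_i=0.01, tau_j=0.02, pi_ij=0.1, S_j=8000, N_j=10000, I_i=500
)
print(round(p, 4))
```

The probability is zero when there are no infections in the origin and approaches one as the number of infected travelers grows.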

3.6.5 The spatial hazard

Although we are more concerned with endemic dynamics here, there are likely to be periods of time early in the rainy season where cholera cases and the rate of transmission are low enough for spatial spread to resemble epidemic dynamics for a time. During such time periods, we can estimate the arrival time of contagion for any location where cases are yet to be reported. We do this by estimating the spatial hazard of transmission:

\[\begin{equation} h(j,t) = \frac{ \beta_{jt}^{\text{hum}} \Big(1 - \exp\big(-((1-\tau_j)S_{jt}/N_{jt}) \sum_{\forall i \not= j} \pi_{ij}\tau_i (I_{it}/N_{it}) \big) \Big) }{ 1/\big(1 + \beta_{jt}^{\text{hum}} (1-\tau_j)S_{jt}\big) }. \tag{3.16} \end{equation}\]

We then normalize to give the waiting time distribution for all locations:

\[\begin{equation} w(j,T) = h(j,T) \prod_{t=1}^{T-1}\big(1-h(j,t)\big). \tag{3.17} \end{equation}\]
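Reading Equation (3.17) as the probability that contagion first arrives at step $T$ (the hazard at $T$ multiplied by the probability of no arrival at any earlier step), the waiting time distribution for one location can be computed with a running product. The hazard values below are hypothetical.

```python
def waiting_time_distribution(hazards):
    """w[T] = h[T] * prod over t < T of (1 - h[t]) for a sequence of hazards."""
    waiting = []
    survival = 1.0  # probability of no arrival before the current step
    for h in hazards:
        waiting.append(h * survival)
        survival *= 1 - h
    return waiting


# Hypothetical per-step hazards for one location.
hazards = [0.1, 0.2, 0.5]
w = waiting_time_distribution(hazards)
print([round(x, 4) for x in w])
```

The values of `w` plus the remaining survival probability sum to one, so `w` can be interpreted as a distribution over arrival times within the window considered.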

3.6.6 Coupling among locations

Another measure of spatial heterogeneity is to quantify the coupling of disease dynamics among metapopulations using a correlation coefficient. Here, we use the definition of spatial correlation between locations $i$ and $j$ as $C_{ij}$ described in Keeling and Rohani (2002), which gives a measure of how similar infection dynamics are between locations.

\[\begin{equation} C_{ij} = \frac{ \frac{1}{T} \sum_{t=1}^{T} ( y_{it} - \bar{y}_i )( y_{jt} - \bar{y}_j ) }{ \sqrt{\text{var}(y_i) \text{var}(y_j)} } \tag{3.18} \end{equation}\] Where $y_{it} = I_{it}/N_i$ and $y_{jt} = I_{jt}/N_j$. Mean prevalence in each location is $\bar{y}_i = \frac{1}{T} \sum_{t=1}^{T} y_{it}$ and $\bar{y}_j = \frac{1}{T} \sum_{t=1}^{T} y_{jt}$.
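A small sketch of the coupling calculation is shown below. It assumes that the product of deviations in Equation (3.18) is averaged over time steps, which makes $C_{ij}$ the usual correlation coefficient between the two prevalence series. The infection counts and population sizes are hypothetical.

```python
from math import sqrt


def coupling(infected_i, infected_j, N_i, N_j):
    """Correlation between prevalence time series in locations i and j."""
    y_i = [x / N_i for x in infected_i]
    y_j = [x / N_j for x in infected_j]
    T = len(y_i)
    mean_i = sum(y_i) / T
    mean_j = sum(y_j) / T
    # Time-averaged product of deviations (assumption stated in the lead-in).
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(y_i, y_j)) / T
    var_i = sum((a - mean_i) ** 2 for a in y_i) / T
    var_j = sum((b - mean_j) ** 2 for b in y_j) / T
    return cov / sqrt(var_i * var_j)


# Hypothetical infection counts over four time steps.
series_a = [10, 20, 30, 40]
synchronous = coupling(series_a, [2 * x for x in series_a], N_i=1000, N_j=5000)
opposed = coupling(series_a, list(reversed(series_a)), N_i=1000, N_j=1000)
print(round(synchronous, 3), round(opposed, 3))
```

Perfectly synchronous dynamics give a value of 1 and perfectly opposed dynamics give a value of -1, regardless of differences in population size.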

3.7 The observation process

3.7.1 Rate of symptomatic infection

The presentation of infection with V. cholerae can be extremely variable. The severity of infection depends on many factors such as the size of the infectious dose, the age of the host, the level of immunity of the host either through vaccination or previous infection, and naivety to the particular strain of V. cholerae. Additional circumstantial factors such as nutritional status and overall pathogen burden may also impact infection severity. At the population level, the observed proportion of infections that are symptomatic is also dependent on the endemicity of cholera in the region. Highly endemic areas (e.g. parts of Bangladesh; Hegde et al 2024) may have a very low proportion of symptomatic infections due to many previous exposures. Conversely, populations that are largely naive to V. cholerae will exhibit a relatively higher proportion of symptomatic infections (e.g. Haiti; Finger et al 2024).

Accounting for all of these nuances in the first version of this model is not possible, but past studies do contain some information that can help to set sensible bounds on our definition of the proportion of infections that are symptomatic ($\sigma$). We have therefore compiled a short list of sero-surveys and cohort studies that assess the likelihood of symptomatic infection in different locations and display those results in Table 3.8.

To provide a reasonably informed prior for the proportion of infections that are symptomatic, we calculated the combined mean and confidence intervals of all studies in Table 3.8 and fit a Beta distribution that corresponds to these quantiles using least squares and the Nelder-Mead algorithm. The resulting prior distribution for the symptomatic proportion $\sigma$ is:

\[\begin{equation} \sigma \sim \text{Beta}(4.30, 13.51) \end{equation}\]

Table 3.8: Summary of Studies on Cholera Immunity
Mean	High CI	Low CI	Location	Source	Note
0.570	NA	NA	NA	Nelson et al (2009)	Review
NA	1.000	0.250	NA	Leung & Matrajt (2021)	Review
NA	0.600	0.200	Endemic regions	Harris et al (2012)	Review
0.238	0.250	0.227	Haiti	Finger et al (2024)	Sero-survey and clinical data
0.213	0.231	0.194	Haiti	Jackson et al (2013)	Cross-sectional sero-survey
0.204	NA	NA	Pakistan	Bart et al (1970)	Sero-survey during epidemic; El Tor Ogawa strain
0.371	NA	NA	Pakistan	Bart et al (1970)	Sero-survey during epidemic; Inaba strain
0.184	0.256	0.112	Bangladesh	Harris et al (2008)	Household cohort; mean of all age groups
0.001	0.000	0.001	Bangladesh	Hegde et al (2024)	Sero-survey and clinical data

The prior distribution for $\sigma$ is plotted in Figure 3.25A with the reported values of the proportion symptomatic from previous studies shown in Figure 3.25B.

Figure 3.25: Proportion of infections that are symptomatic.

3.7.2 Suspected cases

The clinical presentation of diarrheal diseases is often similar across various pathogens, which can lead to systematic biases in the reported number of cholera cases. It is anticipated that the number of suspected cholera cases is related to the actual number of infections by a factor of $1/\rho$, where $\rho$ represents the proportion of suspected cases that are true infections. To adjust for this bias, we use estimates from the meta-analysis by Wiens et al. (2023), which suggests that suspected cholera cases outnumber true infections by approximately 2 to 1, with a mean across studies indicating that 52% (24-80% 95% CI) of suspected cases are actual cholera infections. A higher estimate was reported for outbreak settings (78%, 40-99% 95% CI). To account for the variability in this estimate, we fit a Beta distribution to the reported quantiles using a least squares approach and the Nelder-Mead algorithm, resulting in the prior distribution shown in Figure 3.26B:

\[\begin{equation} \rho \sim \text{Beta}(4.79, 1.53). \end{equation}\]

Figure 3.26: Proportion of suspected cholera cases that are true infections. Panel A shows the ‘low’ assumption which estimates across all settings: $\rho \sim \text{Beta}(5.43, 5.01)$. Panel B shows the ‘high’ assumption where the estimate reflects high-quality studies during outbreaks: $\rho \sim \text{Beta}(4.79, 1.53)$
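As a rough sanity check on the two priors quoted in the Figure 3.26 caption, the sketch below compares the analytical mean of each Beta distribution with the reported point estimates (52% and 78%) and draws an approximate 95% interval by simulation. It does not reproduce the least squares fit itself; the seed, the number of draws, and the 1,000 suspected cases are illustrative assumptions.

```python
import random

def beta_mean(shape1: float, shape2: float) -> float:
    """Analytical mean of a Beta(shape1, shape2) distribution."""
    return shape1 / (shape1 + shape2)

# Shape parameters quoted in the Figure 3.26 caption.
priors = {
    "low (all settings)": (5.43, 5.01),
    "high (outbreaks)": (4.79, 1.53),
}

rng = random.Random(1)  # fixed seed (assumption) so the draws are reproducible
summary = {}
for label, (a, b) in priors.items():
    # Monte Carlo draws give an approximate central 95% interval.
    draws = sorted(rng.betavariate(a, b) for _ in range(20_000))
    summary[label] = {
        "mean": beta_mean(a, b),
        "q025": draws[int(0.025 * len(draws))],
        "q975": draws[int(0.975 * len(draws))],
    }

# Hypothetical example: expected true infections among 1,000 suspected cases.
suspected_cases = 1000
expected_true = {k: v["mean"] * suspected_cases for k, v in summary.items()}

for label, stats in summary.items():
    print(label, {k: round(v, 3) for k, v in stats.items()})
```

The prior means (about 0.52 and 0.76) sit close to the reported estimates, which is what one would expect from a fit to the published quantiles.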

3.7.3 Case fatality rate

The Case Fatality Rate (CFR) among symptomatic infections was calculated using reported cases and deaths data from January 2021 to August 2024. The data were collated from various issues of the WHO Weekly Epidemiological Record and the Global Cholera and Acute Watery Diarrhea (AWD) Dashboard (see Data section), which provide annual aggregations of reported cholera cases and deaths. We then used the Binomial exact test (binom.test in R) to estimate the probability of death (successes) given the number of reported cases (sample size), with binomial confidence intervals calculated using the Clopper-Pearson method. Finally, we fit Beta distributions to the mean CFR and 95% confidence intervals calculated for each country using least squares and the Nelder-Mead algorithm to give the distributional uncertainty around the CFR estimate for each country ($\mu_j$).

\[ \mu_j \sim \text{Beta}(s_{1,j}, s_{2,j}) \]

Where $s_{1,j}$ and $s_{2,j}$ are the two positive shape parameters of the Beta distribution estimated for destination $j$. By definition, $\mu_j$ is the CFR for reported cases, which are a subset of the total number of infections. Therefore, to infer the total number of deaths attributable to cholera infection, we assume that the CFR of observed cases is proportionally equivalent to the CFR of all cases and then calculate total deaths $D$ as follows:

\[\begin{equation} \begin{aligned} \text{CFR}_{\text{observed}} &= \text{CFR}_{\text{total}}\\[3pt] \frac{[\text{observed deaths}]}{[\text{observed cases}]} &= \frac{[\text{total deaths}]}{[\text{all infections}]}\\[3pt] \text{total deaths} &= \frac{[\text{observed deaths}] \times [\text{all infections}]}{[\text{observed cases}]}\\[3pt] D_{jt} &= \frac{ [\sigma\rho\mu_j I_{jt}] \times [I_{jt}] }{ [\sigma\rho I_{jt}] } \end{aligned} \tag{3.19} \end{equation}\]
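
As a quick check on Equation 3.19, and assuming $\sigma\rho I_{jt} > 0$ so that the common factors cancel, the last line reduces to the CFR applied to all infections:

\[ D_{jt} = \frac{ \sigma\rho\mu_j I_{jt} \times I_{jt} }{ \sigma\rho I_{jt} } = \mu_j I_{jt} \]

For example, with a hypothetical $I_{jt} = 1000$ infections and $\mu_j = 0.02$, this gives $D_{jt} = 20$ deaths.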

Table 3.9: CFR Values and Beta Shape Parameters for AFRO Countries
Country	Cases (2014-2024)	Deaths (2014-2024)	CFR	CFR Lower	CFR Upper	Beta Shape1	Beta Shape2
AFRO Region	1290616	24610	0.019	0.019	0.019	0.008	1.912
Angola	3881	122	0.031	0.026	0.037	0.011	1.911
Burundi	5695	41	0.007	0.005	0.010	0.007	1.902
Benin	3617	56	0.015	0.012	0.020	0.008	1.906
Burkina Faso	7	0	0.019	0.019	0.019	0.008	1.912
Cote d’Ivoire	446	18	0.040	0.024	0.063	0.013	1.863
Cameroon	29978	926	0.031	0.029	0.033	0.010	1.929
Democratic Republic of Congo	324021	5857	0.018	0.018	0.019	0.008	1.899
Congo	144	10	0.019	0.019	0.019	0.008	1.912
Comoros	11171	153	0.014	0.012	0.016	0.008	1.896
Ethiopia	73920	928	0.013	0.012	0.013	0.007	1.912
Ghana	35107	293	0.008	0.007	0.009	0.007	1.908
Guinea	1	0	0.019	0.019	0.019	0.008	1.912
Guinea-Bissau	11	2	0.019	0.019	0.019	0.008	1.912
Kenya	47956	683	0.014	0.013	0.015	0.008	1.925
Liberia	580	0	0.000	0.000	0.006	0.006	1.938
Mali	12	4	0.019	0.019	0.019	0.008	1.912
Mozambique	85493	335	0.004	0.004	0.004	0.006	1.881
Malawi	62916	1859	0.030	0.028	0.031	0.010	1.888
Namibia	485	13	0.027	0.014	0.045	0.012	2.021
Niger	12705	357	0.028	0.025	0.031	0.010	1.897
Nigeria	265652	7242	0.027	0.027	0.028	0.009	1.891
Rwanda	453	0	0.000	0.000	0.008	0.007	1.926
Sudan	362	11	0.030	0.015	0.054	0.012	1.855
Somalia	134839	1849	0.014	0.013	0.014	0.008	1.906
South Sudan	56108	1140	0.020	0.019	0.022	0.009	1.915
Eswatini	2	0	0.019	0.019	0.019	0.008	1.912
Chad	1359	90	0.066	0.054	0.081	0.015	1.857
Togo	771	38	0.049	0.035	0.067	0.014	1.866
Tanzania	45865	667	0.015	0.013	0.016	0.008	1.915
Uganda	9286	182	0.020	0.017	0.023	0.009	1.906
South Africa	1403	47	0.033	0.025	0.044	0.012	2.008
Zambia	30686	894	0.029	0.027	0.031	0.010	1.893
Zimbabwe	45684	793	0.017	0.016	0.019	0.008	1.903
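
The CFR, CFR Lower, and CFR Upper columns can be approximately reproduced with an exact (Clopper-Pearson) binomial interval. The text performs this step with binom.test in R; the Python sketch below is an illustrative stand-in that solves for the interval bounds by bisection on the binomial CDF, using only the standard library. The function names and the tolerance are assumptions, and the Beta fitting step is not shown.

```python
import math

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)

def solve_monotone(f, target: float, increasing: bool, tol: float = 1e-10) -> float:
    """Bisection on (0, 1) for a monotone function f with f(p) = target."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def clopper_pearson(deaths: int, cases: int, conf_level: float = 0.95):
    """Return (CFR point estimate, lower bound, upper bound)."""
    alpha = 1.0 - conf_level
    if deaths == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= deaths | p) = alpha / 2, increasing in p.
        lower = solve_monotone(
            lambda p: 1.0 - binom_cdf(deaths - 1, cases, p), alpha / 2.0, True
        )
    if deaths == cases:
        upper = 1.0
    else:
        # Upper bound: P(X <= deaths | p) = alpha / 2, decreasing in p.
        upper = solve_monotone(
            lambda p: binom_cdf(deaths, cases, p), alpha / 2.0, False
        )
    return deaths / cases, lower, upper

# Counts taken from the Angola and Liberia rows of the table above.
angola = clopper_pearson(122, 3881)
liberia = clopper_pearson(0, 580)
print([round(x, 3) for x in angola], [round(x, 3) for x in liberia])
```

Bisection is used here only to avoid a dependency on a Beta quantile function; a statistics library would normally compute the same bounds directly.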

Figure 3.27: Case Fatality Rate (CFR) and Total Cases by Country in the AFRO Region from 2014 to 2024. Panel A: Case Fatality Rate (CFR) with 95% confidence intervals. Panel B: total number of cholera cases. The AFRO Region is highlighted in black; all countries with fewer than 3/0.02 = 150 total reported cases are assigned the mean CFR for AFRO.

Figure 3.28: Beta distributions of the overall Case Fatality Rate (CFR) from 2014 to 2024. Examples show the overall CFR for the AFRO region (2%) in black, Congo with the highest CFR (7%) in red, and South Sudan with the lowest CFR (0.1%) in blue.

3.8 Demographics

The model includes basic demographic change by using reported birth and death rates for each of the $j$ countries, $b_j$ and $d_j$ respectively. These rates are static and defined by the United Nations Department of Economic and Social Affairs Population Division World Population Prospects 2024. Values for $b_j$ and $d_j$ are derived from crude rates and converted to birth rate per day and death rate per day (shown in Table 3.11).

Table 3.11: Demographics for AFRO countries in 2023. Data include total population as of January 1, 2023, daily birth rate, and daily death rate. Values are calculated from crude birth and death rates from UN World Population Prospects 2024.
Country	Population	Birth rate	Death rate
Algeria	45831343	0.0000542	1.28e-05
Angola	36186956	0.0001046	1.93e-05
Benin	13934166	0.0000940	2.44e-05
Botswana	2459937	0.0000683	1.58e-05
Burkina Faso	22765636	0.0000877	2.21e-05
Burundi	13503998	0.0000935	1.87e-05
Cameroon	27997833	0.0000937	1.99e-05
Cape Verde	521047	0.0000339	1.39e-05
Central African Republic	5064592	0.0001292	2.63e-05
Chad	18767684	0.0001196	3.11e-05
Comoros	842267	0.0000793	1.99e-05
Congo	6108142	0.0000849	1.74e-05
Côte d’Ivoire	30783520	0.0000887	2.12e-05
Democratic Republic of Congo	104063312	0.0001150	2.37e-05
Equatorial Guinea	1825480	0.0000821	2.18e-05
Eritrea	3438999	0.0000789	1.67e-05
Eswatini	1224706	0.0000663	2.12e-05
Ethiopia	127028360	0.0000886	1.65e-05
Gabon	2457715	0.0000766	1.74e-05
Gambia	2666786	0.0000843	1.74e-05
Ghana	33467371	0.0000728	1.95e-05
Guinea	14229395	0.0000939	2.53e-05
Guinea-Bissau	2129290	0.0000832	1.95e-05
Kenya	54793511	0.0000750	2.00e-05
Lesotho	2298496	0.0000664	2.93e-05
Liberia	5432670	0.0000858	2.24e-05
Madagascar	30813475	0.0000890	2.09e-05
Malawi	20832833	0.0000871	1.49e-05
Mali	23415909	0.0001113	2.40e-05
Mauritania	4948362	0.0000957	1.54e-05
Mauritius	1274659	0.0000254	2.39e-05
Mozambique	33140626	0.0001042	1.95e-05
Namibia	2928037	0.0000718	1.71e-05
Niger	25727295	0.0001167	2.47e-05
Nigeria	225494749	0.0000912	3.25e-05
Rwanda	13802596	0.0000785	1.64e-05
São Tomé & Príncipe	228558	0.0000780	1.54e-05
Senegal	17867073	0.0000816	1.55e-05
Seychelles	126694	0.0000377	2.27e-05
Sierra Leone	8368119	0.0000848	2.30e-05
Somalia	18031404	0.0001198	2.74e-05
South Africa	62796883	0.0000518	2.55e-05
South Sudan	11146895	0.0000807	2.71e-05
Tanzania	65657004	0.0000979	1.61e-05
Togo	9196283	0.0000863	2.13e-05
Uganda	47981110	0.0000978	1.35e-05
Zambia	20430382	0.0000919	1.45e-05
Zimbabwe	16203259	0.0000840	2.10e-05
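
The conversion from crude rates to daily rates is not spelled out in the text. A minimal sketch, assuming crude rates are expressed per 1,000 population per year and are divided evenly over a 365-day year, is shown below. The implied crude rates are back-calculated from the Angola row of Table 3.11 for illustration only and are not quoted from the UN data.

```python
DAYS_PER_YEAR = 365  # assumption: simple 365-day year

def crude_to_daily(crude_rate_per_1000: float) -> float:
    """Convert a crude annual rate per 1,000 people to a per-capita daily rate."""
    return crude_rate_per_1000 / 1000.0 / DAYS_PER_YEAR

def daily_to_crude(daily_rate: float) -> float:
    """Inverse conversion, used here to back-calculate the implied crude rate."""
    return daily_rate * DAYS_PER_YEAR * 1000.0

# Angola row from Table 3.11.
population = 36_186_956
b_daily = 0.0001046
d_daily = 1.93e-05

implied_crude_birth = daily_to_crude(b_daily)  # per 1,000 per year (back-calculated)
implied_crude_death = daily_to_crude(d_daily)  # per 1,000 per year (back-calculated)

# Expected daily births and deaths implied by the static rates.
expected_births_per_day = population * b_daily
expected_deaths_per_day = population * d_daily

print(round(implied_crude_birth, 1), round(implied_crude_death, 1))
print(round(expected_births_per_day), round(expected_deaths_per_day))
```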

3.9 The reproductive number

The reproductive number is a common metric of epidemic growth that represents the average number of secondary cases generated by a primary case at a specific time during an epidemic. We track how $R$ changes over time by estimating the instantaneous reproductive number $R_t$ as described in Cori et al 2013. We track $R_t$ across all metapopulations in the model to give $R_{jt}$ using the following formula:

\[\begin{equation} R_{jt} = \frac{I_{jt}}{\sum_{\Delta t=1}^{t} g(\Delta t) I_{j,t-\Delta t}} \tag{3.20} \end{equation}\]

Where $I_{jt}$ is the number of new infections in destination $j$ at time $t$, and $g(\Delta t)$ is the probability that the generation time of cholera equals $\Delta t$. The denominator is a weighted sum of past infections, with weights given by the generation time distribution, so estimates of $R_{jt}$ are sensitive to the assumed distribution.
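
Equation 3.20 can be sketched for a single location as follows. The function name, the discretized generation time probabilities, and the incidence series are hypothetical; time is indexed from zero, so the sum runs over whatever history is available.

```python
def instantaneous_r(incidence, gen_time_pmf):
    """Estimate R_t for one location following Equation 3.20.

    incidence[t]        : new infections at time t
    gen_time_pmf[k - 1] : g(k), probability that the generation time is k steps
    Returns a list with None where the denominator is zero.
    """
    estimates = []
    for t in range(len(incidence)):
        max_lag = min(t, len(gen_time_pmf))
        # Weighted sum of past infections (denominator of Equation 3.20).
        pressure = sum(
            gen_time_pmf[lag - 1] * incidence[t - lag]
            for lag in range(1, max_lag + 1)
        )
        estimates.append(incidence[t] / pressure if pressure > 0 else None)
    return estimates

# Hypothetical generation time distribution over lags of 1, 2, and 3 days.
g = [0.2, 0.5, 0.3]

flat = instantaneous_r([10] * 8, g)              # constant incidence
growing = instantaneous_r([1, 2, 4, 8, 16], [1.0])  # doubling each step

print(flat)
print(growing)
```

With constant incidence the estimate settles at 1 once the available history covers the full generation time distribution; the earlier values are inflated because the weighted sum is truncated.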

3.10 Initial conditions

Since the first version of the model begins in January 2023 (to leverage available weekly data), we must estimate the initial state of population immunity. Our approach is as follows:

Reported Cases and Infections:
We start by using historical data to determine the total number of reported cholera cases for a given location over the previous X years. Because only symptomatic cases are reported, we multiply the reported case counts by $1/\sigma$ (where $\sigma$ is the proportion of infections that are symptomatic) to approximate the total number of infections.
Natural Immunity:
Next, we estimate the number of individuals who acquired immunity through natural infection during the past X years. This involves adjusting the total infections by accounting for immune decay, governed by the parameter $\varepsilon$.
Vaccine-derived Immunity:
We also sum the total number of vaccinations administered over the past X years. This number is adjusted by the vaccine effectiveness (denoted $\phi$) and the waning immunity rate ($\omega$) to estimate the current number of individuals with vaccine-derived immunity.
Deconvolution:
Finally, we combine these estimates using a deconvolution approach based on the estimated immune decay parameters (from both vaccination and natural infection) to set the model’s initial conditions.

In total, the initial conditions reflect the estimated total number infected (backed out from reported cases), the number immune due to natural infection, and the number immune from past vaccination campaigns.
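
The steps above can be sketched as follows. The deconvolution procedure is not specified in detail, so this example substitutes simple exponential discounting of past infections and vaccinations; that choice, the function name, and all numerical values are illustrative assumptions rather than the model's actual method or parameter estimates.

```python
import math

def remaining_immune(events, decay_rate_per_day):
    """Sum newly immune individuals, discounted by exponential waning.

    events: iterable of (days_before_model_start, newly_immune) pairs.
    """
    return sum(n * math.exp(-decay_rate_per_day * days) for days, n in events)

# Hypothetical parameter values (not estimates from the model).
sigma = 0.25                 # proportion of infections that are symptomatic
epsilon = 1.0 / (3 * 365)    # waning rate of natural immunity, per day
phi = 0.6                    # vaccine effectiveness
omega = 1.0 / (2 * 365)      # waning rate of vaccine-derived immunity, per day

# Hypothetical history: (days before model start, count).
reported_cases = [(730, 1200), (365, 800), (30, 300)]
doses_given = [(400, 50_000)]

# Step 1: scale reported (symptomatic) cases up to total infections.
infections = [(days, cases / sigma) for days, cases in reported_cases]

# Step 2: natural immunity remaining at model start.
immune_natural = remaining_immune(infections, epsilon)

# Step 3: vaccine-derived immunity remaining at model start.
effective_doses = [(days, phi * n) for days, n in doses_given]
immune_vaccine = remaining_immune(effective_doses, omega)

print(round(immune_natural), round(immune_vaccine))
```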

3.11 Model calibration

Model calibration is performed to fine-tune the hyperparameters and ensure the model accurately represents the observed data. Our calibration strategy involves:

Latin Hypercube Sampling (LHS):
We use LHS to explore the parameter space efficiently. This method helps generate a diverse set of hyperparameter combinations for evaluation.
Likelihood Fitting:
For each set of hyperparameters, the model likelihood is computed based on the observed incidence and death data. The calibration process searches for the hyperparameter set that maximizes the model’s likelihood.
Data Challenges:
A key challenge in calibration is the incomplete or aggregated nature of the available data. To address this, we incorporate methods that allow flexible fitting even when data are sparse or reported at different temporal scales.

By combining LHS and likelihood-based calibration, we aim to identify a robust set of hyperparameters that accurately capture both the temporal and spatial dynamics of cholera transmission in Sub-Saharan Africa.
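
A minimal sketch of this calibration loop is given below. The text does not state the likelihood form, so a Poisson likelihood is assumed here, and the exponential growth function is a toy stand-in for the transmission model. The parameter names, bounds, and observed counts are all hypothetical.

```python
import math
import random

def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points with one point per stratum in every dimension.

    bounds: dict mapping parameter name -> (low, high).
    """
    columns = {}
    for name, (low, high) in bounds.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)  # independent stratum order for each parameter
        columns[name] = [
            low + (s + rng.random()) / n_samples * (high - low) for s in strata
        ]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]

def poisson_log_likelihood(observed, expected):
    """Log-likelihood of observed counts given positive expected counts."""
    return sum(
        o * math.log(e) - e - math.lgamma(o + 1) for o, e in zip(observed, expected)
    )

def toy_model(params, n_steps=6):
    """Hypothetical stand-in for the transmission model: exponential growth."""
    return [params["initial"] * math.exp(params["growth"] * t) for t in range(n_steps)]

observed = [10, 13, 18, 22, 30, 41]  # hypothetical case counts
rng = random.Random(42)
candidates = latin_hypercube(
    200, {"initial": (1.0, 30.0), "growth": (0.0, 0.6)}, rng
)
scores = [poisson_log_likelihood(observed, toy_model(p)) for p in candidates]
best = candidates[scores.index(max(scores))]
print(best, round(max(scores), 2))
```

Compared with plain random sampling, stratifying each parameter range guarantees that every part of the range is visited once, which matters when each model run is expensive.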

3.12 Table of MOSAIC framework countries

Table 3.13: List of MOSAIC Countries with Cholera News
ISO	Country	Region	Cholera News
BDI	Burundi	Eastern Africa	Cholera News: Burundi
ERI	Eritrea	Eastern Africa	Cholera News: Eritrea
ETH	Ethiopia	Eastern Africa	Cholera News: Ethiopia
KEN	Kenya	Eastern Africa	Cholera News: Kenya
MWI	Malawi	Eastern Africa	Cholera News: Malawi
MOZ	Mozambique	Eastern Africa	Cholera News: Mozambique
RWA	Rwanda	Eastern Africa	Cholera News: Rwanda
SOM	Somalia	Eastern Africa	Cholera News: Somalia
SSD	South Sudan	Eastern Africa	Cholera News: South Sudan
TZA	Tanzania	Eastern Africa	Cholera News: Tanzania
UGA	Uganda	Eastern Africa	Cholera News: Uganda
ZMB	Zambia	Eastern Africa	Cholera News: Zambia
ZWE	Zimbabwe	Eastern Africa	Cholera News: Zimbabwe
AGO	Angola	Middle Africa	Cholera News: Angola
CMR	Cameroon	Middle Africa	Cholera News: Cameroon
CAF	Central African Republic	Middle Africa	Cholera News: Central African Republic
TCD	Chad	Middle Africa	Cholera News: Chad
COD	Democratic Republic of the Congo	Middle Africa	Cholera News: Democratic Republic of the Congo
GNQ	Equatorial Guinea	Middle Africa	Cholera News: Equatorial Guinea
GAB	Gabon	Middle Africa	Cholera News: Gabon
COG	Republic of the Congo	Middle Africa	Cholera News: Republic of the Congo
BWA	Botswana	Southern Africa	Cholera News: Botswana
SWZ	Kingdom of eSwatini	Southern Africa	Cholera News: Kingdom of eSwatini
NAM	Namibia	Southern Africa	Cholera News: Namibia
ZAF	South Africa	Southern Africa	Cholera News: South Africa
BEN	Benin	Western Africa	Cholera News: Benin
BFA	Burkina Faso	Western Africa	Cholera News: Burkina Faso
CIV	Côte d’Ivoire	Western Africa	Cholera News: Côte d’Ivoire
GHA	Ghana	Western Africa	Cholera News: Ghana
GIN	Guinea	Western Africa	Cholera News: Guinea
GNB	Guinea-Bissau	Western Africa	Cholera News: Guinea-Bissau
LBR	Liberia	Western Africa	Cholera News: Liberia
MLI	Mali	Western Africa	Cholera News: Mali
MRT	Mauritania	Western Africa	Cholera News: Mauritania
NER	Niger	Western Africa	Cholera News: Niger
NGA	Nigeria	Western Africa	Cholera News: Nigeria
SEN	Senegal	Western Africa	Cholera News: Senegal
SLE	Sierra Leone	Western Africa	Cholera News: Sierra Leone
GMB	The Gambia	Western Africa	Cholera News: The Gambia
TGO	Togo	Western Africa	Cholera News: Togo

3.13 Table of model parameters

Parameter	Description	Distribution	Source
$i$	Index representing the origin metapopulation.
$j$	Index representing the destination metapopulation.
$t$	Time step (one day).
$b_{jt}$	Birth rate of population $j$.		UN World Population Prospects
$d_{jt}$	Mortality rate of population $j$.		UN World Population Prospects
$N_{jt}$	Population size of destination $j$ at time $t$.
$S_{jt}$	Number of susceptible individuals in destination $j$ at time $t$.
$V_{1,jt}$	Number of individuals with one-dose vaccination in destination $j$ at time $t$.
$V_{2,jt}$	Number of individuals with two-dose vaccination in destination $j$ at time $t$.
$I_{1,jt}$	Number of symptomatic infected individuals in destination $j$ at time $t$.
$I_{2,jt}$	Number of asymptomatic infected individuals in destination $j$ at time $t$.
$W_{jt}$	Amount of V. cholerae in the environment in destination $j$ at time $t$.
$R_{jt}$	Number of recovered (immune) individuals in destination $j$ at time $t$.
$\Lambda_{j,t+1}$	Human-to-human force of infection in destination $j$ at time $t+1$.
$\Psi_{j,t+1}$	Environment-to-human force of infection in destination $j$ at time $t+1$.
$\iota$	The incubation period of cholera infection	$1.4 \ \text{days} \ (1.3–1.6 \ 95\% \text{CI})$	Azman et al 2013
$\phi_1$	Vaccine effectiveness of one-dose OCV.
$\phi_2$	Vaccine effectiveness of two-dose OCV.
$\nu_{jt}$	Vaccination rate (OCV doses administered) in destination $j$ at time $t$.
$\omega_1$	Waning immunity rate of vaccinated individuals with one-dose OCV.
$\omega_2$	Waning immunity rate of vaccinated individuals with two-dose OCV.
$\varepsilon$	Waning immunity rate of recovered individuals.
$\gamma_1$	Recovery rate of symptomatic infected individuals.
$\gamma_2$	Recovery rate of asymptomatic infected individuals.
$\mu$	Mortality rate due to V. cholerae infection.
$\sigma$	Proportion of infections that are symptomatic.
$\rho$	Proportion of suspected cases that are true infections.
$\zeta_1$	Shedding rate of V. cholerae by symptomatic individuals.	$0.1\mbox{-}10$	Fung 2014
$\zeta_2$	Shedding rate of V. cholerae by asymptomatic individuals.	$0.1\mbox{-}10$	Fung 2014
$\delta$	Environmental decay rate of V. cholerae.	Determined dynamically in model based on $\psi_{jt}$.
$\delta_{\text{min}}$	Minimum decay rate when $\psi_{jt}=0$.	$0.333 \ (3 \ \text{days})$
$\delta_{\text{max}}$	Maximum decay rate when $\psi_{jt}=1$.	$0.011 \ (90 \ \text{days})$
$\psi_{jt}$	Environmental suitability of V. cholerae in destination $j$ at time $t$.	Estimated by LSTM-RNN model.
$\beta_{j0}^{\text{hum}}$	Baseline human-to-human transmission rate in destination $j$.
$\beta_{jt}^{\text{hum}}$	Seasonal human-to-human transmission rate in destination $j$ at time $t$.
$\beta_{j0}^{\text{env}}$	Baseline environment-to-human transmission rate in destination $j$.
$\beta_{jt}^{\text{env}}$	Environment-to-human transmission rate in destination $j$ at time $t$.
$a_1$	First Fourier cosine coefficient for seasonality.	See Table 3.1.	Altizer et al 2006
$b_1$	First Fourier sine coefficient for seasonality.	See Table 3.1.	Altizer et al 2006
$a_2$	Second Fourier cosine coefficient for seasonality.	See Table 3.1.	Altizer et al 2006
$b_2$	Second Fourier sine coefficient for seasonality.	See Table 3.1.	Altizer et al 2006
$p$	Period of the seasonal cycle (in days).	$365$
$\alpha_1$	Exponent on infectious individuals in the force of infection numerator.	$0.95$	Glass et al 2003
$\alpha_2$	Exponent on population size in the force of infection denominator; determines density (0) vs frequency (1) dependence.	$0.95$	McCallum et al 2001
$\tau_i$	Probability an individual departs from origin $i$.
$\pi_{ij}$	Probability of travel from origin $i$ to destination $j$ given departure.
$\theta_{j}$	Proportion with adequate WASH in destination $j$.	See Figure 3.15.	Sikder et al 2023
$\kappa$	Concentration of V. cholerae (cells/mL) required for 50% infection probability.	$10^5\mbox{-}10^6$	Fung 2014

3.14 Table of stochastic transitions

Term	Description	Stochastic transition
$\mathbf{S}$ (susceptible)
$+ b_{jt} N_{jt}$	New individuals entering the susceptible class from births.	$\text{Pois}\big( N_{jt}b_{jt} \big)$
$+ \varepsilon R_{jt}$	Loss of immunity for recovered individuals.	$\text{Binom}\big( R_{jt},\; 1 - \exp(-\varepsilon) \big)$
$+ \omega_1 V_{1,jt}$	Waning immunity from one-dose OCV.	$\text{Binom}\big( V_{1,jt},\; 1 - \exp(-\omega_1) \big)$
$+ \omega_2 V_{2,jt}$	Waning immunity from two-dose OCV.	$\text{Binom}\big( V_{2,jt},\; 1 - \exp(-\omega_2) \big)$
$- \nu_{1,jt}S_{jt}/(S_{jt} + E_{jt})$	Susceptible individuals receiving one-dose OCV (leaving $S$).	$\text{Pois}\Big( \nu_{1,jt} \cdot \frac{S_{jt}}{(S_{jt}+E_{jt})} \Big)$
$- \Lambda^{S}_{j,t+1}$	Human-to-human force of infection on the susceptible class.	$\text{Binom}\Big((1-\tau_{j})S_{jt},\ 1 - \exp\big({-\beta_{jt}^{\text{hum}} ((1-\tau_{j})I_{jt} + \sum_{\forall i \not= j} (\pi_{ij}\tau_iI_{it}))^{\alpha_1} / N_{jt}^{\alpha_2}}\big)\Big)$
$- \Psi^S_{j,t+1}$	Environment-to-human force of infection on the susceptible class.	$\text{Binom}\Big((1-\tau_{j})S_{jt},\ 1 - \exp\big({-\beta_{jt}^{\text{env}} (1-\theta_j) W_{jt} / (\kappa+W_{jt})}\big)\Big)$
$- d_{jt} S_{jt}$	Background death among susceptible individuals.	$\text{Binom}\big( S_{jt},\; 1 - \exp(-d_{jt}) \big)$
$\mathbf{V_1}$ (one-dose OCV)
$+ \nu_{1,jt} S_{jt}/(S_{jt} + E_{jt})$	Entry of susceptible individuals into the one-dose vaccinated class.	$\text{Pois}\Big( \nu_{1,jt} \cdot \frac{S_{jt}}{(S_{jt}+E_{jt})} \Big)$
$- \omega_1 V_{1,jt}$	Waning immunity in the one-dose vaccinated class.	$\text{Binom}\big( V_{1,jt},\; 1 - \exp(-\omega_1) \big)$
$- \Lambda^{V_1}_{j,t+1}$	Human-to-human force of infection on the one-dose vaccinated class.	$\text{Binom}\Big((1-\tau_{j})(1-\phi_1)V_{1,jt},\ 1 - \exp\big({-\beta_{jt}^{\text{hum}} ((1-\tau_{j})I_{jt} + \sum_{\forall i \not= j} (\pi_{ij}\tau_iI_{it}))^{\alpha_1} / N_{jt}^{\alpha_2}}\big)\Big)$
$- \Psi^{V_1}_{j,t+1}$	Environment-to-human force of infection on the one-dose vaccinated class.	$\text{Binom}\Big((1-\tau_{j})(1-\phi_1)V_{1,jt},\ 1 - \exp\big({-\beta_{jt}^{\text{env}} (1-\theta_j) W_{jt} / (\kappa+W_{jt})}\big)\Big)$
$- d_{jt} V_{1,jt}$	Background death among one-dose vaccinated individuals.	$\text{Binom}\big( V_{1,jt},\; 1 - \exp(-d_{jt}) \big)$
$\mathbf{V_2}$ (two-dose OCV)
$+ \nu_{2,jt}$	Transition of one-dose vaccinated individuals to the two-dose vaccinated class (full course of OCV).	$\text{Pois}\big( \nu_{2,jt} \big)$
$- \omega_2 V_{2,jt}$	Waning immunity in the two-dose vaccinated class.	$\text{Binom}\big( V_{2,jt},\; 1 - \exp(-\omega_2) \big)$
$- \Lambda^{V_2}_{j,t+1}$	Human-to-human force of infection on the two-dose vaccinated class.	$\text{Binom}\Big((1-\tau_{j})(1-\phi_2)V_{2,jt},\ 1 - \exp\big({-\beta_{jt}^{\text{hum}} ((1-\tau_{j})I_{jt} + \sum_{\forall i \not= j} (\pi_{ij}\tau_iI_{it}))^{\alpha_1} / N_{jt}^{\alpha_2}}\big)\Big)$
$- \Psi^{V_2}_{j,t+1}$	Environment-to-human force of infection on the two-dose vaccinated class.	$\text{Binom}\Big((1-\tau_{j})(1-\phi_2)V_{2,jt},\ 1 - \exp\big({-\beta_{jt}^{\text{env}} (1-\theta_j) W_{jt} / (\kappa+W_{jt})}\big)\Big)$
$- d_{jt} V_{2,jt}$	Background death among two-dose vaccinated individuals.	$\text{Binom}\big( V_{2,jt},\; 1 - \exp(-d_{jt}) \big)$
$\mathbf{E}$ (exposed)
$+ \Lambda_{j,t+1}$	Human-to-human force of infection contributing to new exposures.	$\Lambda^{S}_{j,t+1} + \Lambda^{V_1}_{j,t+1} + \Lambda^{V_2}_{j,t+1}$
$+ \Psi_{j,t+1}$	Environment-to-human force of infection contributing to new exposures.	$\Psi^{S}_{j,t+1} + \Psi^{V_1}_{j,t+1} + \Psi^{V_2}_{j,t+1}$
$- \iota E_{jt}$	Progression of exposed individuals toward the infectious class.	$\text{Binom}\big( E_{jt},\; 1 - \exp(-\iota) \big)$
$- d_{jt} E_{jt}$	Background death among exposed individuals.	$\text{Binom}\big( E_{jt},\; 1 - \exp(-d_{jt}) \big)$
$\mathbf{I_1}$ (symptomatic)
$+ \sigma\,\iota\,E_{jt}$	Exposed individuals progressing to symptomatic infection.	$\text{Binom}\big( \sigma E_{jt},\; 1 - \exp(-\iota) \big)$
$- \gamma_1 I_{1,jt}$	Recovery from symptomatic infection.	$\text{Binom}\big( I_{1,jt},\; 1 - \exp(-\gamma_1) \big)$
$- \mu_j I_{1,jt}$	Deaths due to symptomatic infection.	$\text{Binom}\big( I_{1,jt},\; 1 - \exp(-\mu_j) \big)$
$- d_{jt} I_{1,jt}$	Background death among individuals with symptomatic infection.	$\text{Binom}\big( I_{1,jt},\; 1 - \exp(-d_{jt}) \big)$
$\mathbf{I_2}$ (asymptomatic)
$+ (1-\sigma)\,\iota\,E_{jt}$	Exposed individuals progressing to asymptomatic infection.	$\text{Binom}\big( (1-\sigma) E_{jt},\; 1 - \exp(-\iota) \big)$
$- \gamma_2 I_{2,jt}$	Recovery from asymptomatic infection.	$\text{Binom}\big( I_{2,jt},\; 1 - \exp(-\gamma_2) \big)$
$- d_{jt} I_{2,jt}$	Background death among individuals with asymptomatic infection.	$\text{Binom}\big( I_{2,jt},\; 1 - \exp(-d_{jt}) \big)$
$\mathbf{W}$ (environment)
$+ \zeta_1 I_{1,jt}$	Amount of V. cholerae (cells/mL) shed into the environment by symptomatic individuals.	$\text{Pois}\big( \zeta_1 I_{1,jt} \big)$
$+ \zeta_2 I_{2,jt}$	Amount of V. cholerae (cells/mL) shed into the environment by asymptomatic individuals.	$\text{Pois}\big( \zeta_2 I_{2,jt} \big)$
$- \delta_{jt} W_{jt}$	Decay of viable V. cholerae in the environment.	$\text{Pois}\big( \delta_{jt} W_{jt} \big)$
$\mathbf{R}$ (recovered)
$+ \gamma_1 I_{1,jt}$	Recovery of individuals with symptomatic infection.	$\text{Binom}\big( I_{1,jt},\; 1 - \exp(-\gamma_1) \big)$
$+ \gamma_2 I_{2,jt}$	Recovery of individuals with asymptomatic infection.	$\text{Binom}\big( I_{2,jt},\; 1 - \exp(-\gamma_2) \big)$
$- \varepsilon R_{jt}$	Loss of immunity for recovered individuals.	$\text{Binom}\big( R_{jt},\; 1 - \exp(-\varepsilon) \big)$
$- d_{jt} R_{jt}$	Background death among recovered individuals.	$\text{Binom}\big( R_{jt},\; 1 - \exp(-d_{jt}) \big)$
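
Most rows in the table share one pattern: a per-step rate is converted to a probability with $1 - \exp(-\text{rate})$, and the number of individuals leaving a compartment is drawn from a Binomial distribution. The sketch below illustrates that pattern for the symptomatic compartment. All values are hypothetical, and drawing the exits sequentially from the remaining individuals is an assumption made here so that the compartment cannot go negative; the table does not specify how competing exits are handled.

```python
import math
import random

def rate_to_prob(rate: float) -> float:
    """Probability of at least one event in a time step, given a per-step rate."""
    return 1.0 - math.exp(-rate)

def binomial_draw(n: int, p: float, rng: random.Random) -> int:
    """Binomial(n, p) draw as a sum of Bernoulli trials (fine for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(7)  # fixed seed (assumption) for reproducibility

# Hypothetical state and parameters for one location and one time step.
I1 = 200            # symptomatic infected
R = 1000            # recovered
gamma_1 = 0.2       # recovery rate of symptomatic individuals
mu = 0.01           # mortality rate due to infection
d = 2e-5            # background death rate
epsilon = 0.001     # waning rate of natural immunity

# Exits from I1, drawn sequentially from those still in the compartment.
recoveries = binomial_draw(I1, rate_to_prob(gamma_1), rng)
cholera_deaths = binomial_draw(I1 - recoveries, rate_to_prob(mu), rng)
background_deaths = binomial_draw(
    I1 - recoveries - cholera_deaths, rate_to_prob(d), rng
)

# Exits from R.
waned = binomial_draw(R, rate_to_prob(epsilon), rng)

I1_next = I1 - recoveries - cholera_deaths - background_deaths
R_next = R + recoveries - waned
print(I1_next, R_next)
```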

3.15 Table of vaccination model terms

Term	Population	Equation	Notes
$V^{\text{imm}}_{1,j,t+1}$	Effectively immunized one-dose recipients	$V^{\text{imm}}_{1,j,t+1} = V^{\text{imm}}_{1,jt}$ $+ \ \phi_1 \nu_{1, jt} \cdot S_{jt} \big/ \big(S_{jt} + E_{jt}\big)$ $- \ \omega_1 V^{\text{imm}}_{1,jt}$ $- \ \nu_{2,jt} \cdot V^{\text{imm}}_{1,jt} \big/ \big(V^{\text{imm}}_{1,jt} + V^{\text{sus}}_{1,jt} \big)$	+ Incoming newly vaccinated - Waning vaccine immunity (⇒ $V^{\text{sus}}_{1}$) - Second dose recipients (⇒ $V_2$ compartment)
$V^{\text{sus}}_{1,j,t+1}$	Still susceptible one-dose recipients	$V^{\text{sus}}_{1,j,t+1} = V^{\text{sus}}_{1,jt}$ $+ \ (1 - \phi_1) \nu_{1, jt}$ $+ \ \omega_1 V^{\text{imm}}_{1,jt}$ $- \ \big(\Lambda^{V_1}_{j,t+1} + \Psi^{V_1}_{j,t+1}\big)$ $- \ \nu_{2, jt} \cdot V^{\text{sus}}_{1,jt} \big/ \big(V^{\text{imm}}_{1,jt} + V^{\text{sus}}_{1,jt} \big)$	+ Incoming newly vaccinated + Waning vaccine immunity - Infected (⇒ $E_{j,t}$) - Second dose recipients (⇒ $V_2$ compartment)
$V^{\text{inf}}_{1,j,t+1}$	Infected one-dose recipients	$V^{\text{inf}}_{1,j,t+1} = V^{\text{inf}}_{1,jt}$ $+ \ \big(\Lambda^{V_1}_{j,t+1} + \Psi^{V_1}_{j,t+1}\big)$	+ One-dose recipients infected (⇒ $E_{j,t}$) Compartment used for tracking only.
$V^{\text{imm}}_{2,j,t+1}$	Effectively immunized two-dose recipients	$V^{\text{imm}}_{2,j,t+1} = V^{\text{imm}}_{2,jt}$ $+ \ \phi_2 \nu_{2, jt}$ $- \ \omega_2 V^{\text{imm}}_{2,jt}$	+ Incoming second dose recipients - Waning vaccine immunity (⇒ $V^{\text{sus}}_{2}$)
$V^{\text{sus}}_{2,j,t+1}$	Still susceptible two-dose recipients	$V^{\text{sus}}_{2,j,t+1} = V^{\text{sus}}_{2,jt}$ $+ \ (1 - \phi_2) \nu_{2,jt}$ $+ \ \omega_2 V^{\text{imm}}_{2,jt}$ $- \ \big(\Lambda^{V_2}_{j,t+1} + \Psi^{V_2}_{j,t+1}\big)$	+ Incoming second dose recipients + Waning vaccine immunity - Infected (⇒ $E_{j,t}$)
$V^{\text{inf}}_{2,j,t+1}$	Infected two-dose recipients	$V^{\text{inf}}_{2,j,t+1} = V^{\text{inf}}_{2,jt}$ $+ \ \big(\Lambda^{V_2}_{j,t+1} + \Psi^{V_2}_{j,t+1}\big)$	+ Two-dose recipients infected (⇒ $E_{j,t}$) Compartment used for tracking only.
$V_{1,j,t}$	Total one-dose recipients	$V_{1,j,t} = V^{\text{imm}}_{1,j,t} + V^{\text{sus}}_{1,j,t} + V^{\text{inf}}_{1,j,t}$	Sum of all one-dose sub-compartments. Used for tracking only; approximately equal to reported OCV campaign data.
$V_{2,j,t}$	Total two-dose recipients	$V_{2,j,t} = V^{\text{imm}}_{2,j,t} + V^{\text{sus}}_{2,j,t} + V^{\text{inf}}_{2,j,t}$	Sum of all two-dose sub-compartments. Used for tracking only; approximately equal to reported OCV campaign data.
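
As an illustration of the one-dose rows, the sketch below applies the $V^{\text{imm}}_{1}$ and $V^{\text{sus}}_{1}$ update equations for a single deterministic step, term by term as written in the table. The function name, the zero-denominator guards, and the numerical values are assumptions for illustration; the model itself uses stochastic transitions.

```python
def update_one_dose(v1_imm, v1_sus, S, E, nu1, nu2, phi1, omega1, infections_v1):
    """One deterministic step for the one-dose sub-compartments.

    infections_v1 stands for Lambda^{V_1} + Psi^{V_1}, the new infections
    among one-dose recipients in this step.
    """
    total_v1 = v1_imm + v1_sus
    # Share of second doses taken from each sub-compartment (guard is an assumption).
    frac_imm = v1_imm / total_v1 if total_v1 > 0 else 0.0
    frac_sus = v1_sus / total_v1 if total_v1 > 0 else 0.0
    # First doses reaching susceptible individuals, as in the V1_imm row.
    share_to_S = S / (S + E) if (S + E) > 0 else 0.0

    new_imm = (
        v1_imm
        + phi1 * nu1 * share_to_S   # incoming, effectively immunized
        - omega1 * v1_imm           # waning to V1_sus
        - nu2 * frac_imm            # second dose recipients
    )
    new_sus = (
        v1_sus
        + (1.0 - phi1) * nu1        # incoming, still susceptible (as written in the table)
        + omega1 * v1_imm           # waning from V1_imm
        - infections_v1             # infected, moving to E
        - nu2 * frac_sus            # second dose recipients
    )
    return new_imm, new_sus

# Hypothetical values for one location and one time step.
new_imm, new_sus = update_one_dose(
    v1_imm=5000.0, v1_sus=2000.0, S=90_000.0, E=0.0,
    nu1=1000.0, nu2=0.0, phi1=0.6, omega1=0.001, infections_v1=0.0,
)
print(new_imm, new_sus)
```

With no exposed individuals, no infections, and no second doses, the two sub-compartments together grow by exactly the number of first doses given, since waning only moves individuals between them.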

2 Data

4 Model versions

Parameter	Description	Distribution	Source
\(i\)	Index representing the origin metapopulation.
\(j\)	Index representing the destination metapopulation.
\(t\)	Time step (one week).
\(b_{jt}\)	Birth rate of population \(j\).		UN World Population Prospects
\(d_{jt}\)	Mortality rate of population \(j\).		UN World Population Prospects
\(N_{jt}\)	Population size of destination \(j\) at time \(t\).
\(S_{jt}\)	Number of susceptible individuals in destination \(j\) at time \(t\).
\(V_{1,jt}\)	Number of individuals with one-dose vaccination in destination \(j\) at time \(t\).
\(V_{2,jt}\)	Number of individuals with two-dose vaccination in destination \(j\) at time \(t\).
\(I_{1,jt}\)	Number of symptomatic infected individuals in destination \(j\) at time \(t\).
\(I_{2,jt}\)	Number of asymptomatic infected individuals in destination \(j\) at time \(t\).
\(W_{jt}\)	Amount of V. cholerae in the environment in destination \(j\) at time \(t\).
\(R_{jt}\)	Number of recovered (immune) individuals in destination \(j\) at time \(t\).
\(\Lambda_{j,t+1}\)	Human-to-human force of infection in destination \(j\) at time \(t+1\).
\(\Psi_{j,t+1}\)	Environment-to-human force of infection in destination \(j\) at time \(t+1\).
\(\iota\)	The incubation period of cholera infection	\(1.4 \ \text{days} \ (1.3–1.6 \ 95\% \text{CI})\)	Azman et al 2013
\(\phi_1\)	Vaccine effectiveness of one-dose OCV.
\(\phi_2\)	Vaccine effectiveness of two-dose OCV.
\(\nu_{jt}\)	Vaccination rate (OCV doses administered) in destination \(j\) at time \(t\).
\(\omega_1\)	Waning immunity rate of vaccinated individuals with one-dose OCV.
\(\omega_2\)	Waning immunity rate of vaccinated individuals with two-dose OCV.
\(\varepsilon\)	Waning immunity rate of recovered individuals.
\(\gamma_1\)	Recovery rate of symptomatic infected individuals.
\(\gamma_2\)	Recovery rate of asymptomatic infected individuals.
\(\mu\)	Mortality rate due to V. cholerae infection.
\(\sigma\)	Proportion of infections that are symptomatic.
\(\rho\)	Proportion of suspected cases that are true infections.
\(\zeta_1\)	Shedding rate of V. cholerae by symptomatic individuals.	\(0.1\mbox{-}10\)	Fung 2014
\(\zeta_2\)	Shedding rate of V. cholerae by asymptomatic individuals.	\(0.1\mbox{-}10\)	Fung 2014
\(\delta\)	Environmental decay rate of V. cholerae.	Determined dynamically in model based on \(\psi_{jt}\).
\(\delta_{\text{min}}\)	Minimum decay rate when \(\psi_{jt}=0\).	\(0.333 \ (3 \ \text{days})\)
\(\delta_{\text{max}}\)	Maximum decay rate when \(\psi_{jt}=1\).	\(0.011 \ (90 \ \text{days})\)
\(\psi_{jt}\)	Environmental suitability of V. cholerae in destination \(j\) at time \(t\).	Estimated by LSTM-RNN model.
\(\beta_{j0}^{\text{hum}}\)	Baseline human-to-human transmission rate in destination \(j\).
\(\beta_{jt}^{\text{hum}}\)	Seasonal human-to-human transmission rate in destination \(j\) at time \(t\).
\(\beta_{j0}^{\text{env}}\)	Baseline environment-to-human transmission rate in destination \(j\).
\(\beta_{jt}^{\text{env}}\)	Environment-to-human transmission rate in destination \(j\) at time \(t\).
\(a_1\)	First Fourier cosine coefficient for seasonality.	See Table 3.1.	Altizer et al 2006
\(b_1\)	First Fourier sine coefficient for seasonality.	See Table 3.1.	Altizer et al 2006
\(a_2\)	Second Fourier cosine coefficient for seasonality.	See Table 3.1.	Altizer et al 2006
\(b_2\)	Second Fourier sine coefficient for seasonality.	See Table 3.1.	Altizer et al 2006
\(p\)	Period of the seasonal cycle (set to days).	\(365\)
\(\alpha_1\)	Exponent on infectious individuals in the force of infection numerator.	\(0.95\)	Glass et al 2003
\(\alpha_2\)	Exponent on population size in the force of infection denominator; determines density (0) vs frequency (1) dependence.	\(0.95\)	McCallum et al 2001
\(\tau_i\)	Probability an individual departs from origin \(i\).
\(\pi_{ij}\)	Probability of travel from origin \(i\) to destination \(j\) given departure.
\(\theta_{j}\)	Proportion with adequate WASH in destination \(j\).	See Figure 3.15.	Sikder et al 2023
\(\kappa\)	Concentration of V. cholerae (cells/mL) required for 50% infection probability.	\(10^5\mbox{-}10^6\)	Fung 2014

Term	Description	Stochastic.Transition
\(\mathbf{S}\) (susceptible)
\(+ b_{jt} N_{jt}\)	New individuals entering the susceptible class from births.	\(\text{Pois}\big( N_{jt}b_{jt} \big)\)
\(+ \varepsilon R_{jt}\)	Loss of immunity for recovered individuals.	\(\text{Binom}\big( R_{jt},\; 1 - \exp(-\varepsilon) \big)\)
\(+ \omega_1 V_{1,jt}\)	Waning immunity from one-dose OCV.	\(\text{Binom}\big( V_{1,jt},\; 1 - \exp(-\omega_1) \big)\)
\(+ \omega_2 V_{2,jt}\)	Waning immunity from two-dose OCV.	\(\text{Binom}\big( V_{2,jt},\; 1 - \exp(-\omega_2) \big)\)
\(- \nu_{1,jt}S_{jt}/(S_{jt} + E_{jt})\)	Susceptible individuals receiving one-dose OCV (leaving \(S\)).	\(\text{Pois}\Big( \nu_{1,jt} \cdot \frac{S_{jt}}{(S_{jt}+E_{jt})} \Big)\)
\(- \Lambda^{S}_{j,t+1}\)	Human-to-human force of infection on the susceptible class.	\(\text{Binom}\Big((1-\tau_{j})S_{jt},\ 1 - \exp\big({-\beta_{jt}^{\text{hum}} ((1-\tau_{j})I_{jt} + \sum_{\forall i \not= j} (\pi_{ij}\tau_iI_{it}))^{\alpha_1} / N_{jt}^{\alpha_2}}\big)\Big)\)
\(+ \Psi^S_{j,t+1}\)	Environment-to-human force of infection on the susceptible class.	\(\text{Binom}\Big((1-\tau_{j})S_{jt},\ 1 - \exp\big({-\beta_{jt}^{\text{env}} (1-\theta_j) W_{jt} / (\kappa+W_{jt})}\big)\Big)\)
\(- d_{jt} S_{jt}\)	Background death among susceptible individuals.	\(\text{Binom}\big( S_{jt},\; 1 - \exp(-d_{jt}) \big)\)
\(\mathbf{V_1}\) (one-dose OCV)
\(+ \nu_{1,jt} S_{jt}/(S_{jt} + E_{jt})\)	Entry of susceptible individuals into the one-dose vaccinated class.	\(\text{Pois}\Big( \nu_{1,jt} \cdot \frac{S_{jt}}{(S_{jt}+E_{jt})} \Big)\)
\(- \omega_1 V_{1,jt}\)	Waning immunity in the one-dose vaccinated class.	\(\text{Binom}\big( V_{1,jt},\; 1 - \exp(-\omega_1) \big)\)
\(- \Lambda^{V_1}_{j,t+1}\)	Human-to-human force of infection on the one-dose vaccinated class.	\(\text{Binom}\Big((1-\tau_{j})(1-\phi_1)V_{1,jt},\ 1 - \exp\big({-\beta_{jt}^{\text{hum}} ((1-\tau_{j})I_{jt} + \sum_{\forall i \not= j} (\pi_{ij}\tau_iI_{it}))^{\alpha_1} / N_{jt}^{\alpha_2}}\big)\Big)\)
\(- \Psi^{V_1}_{j,t+1}\)	Environment-to-human force of infection on the one-dose vaccinated class.	\(\text{Binom}\Big((1-\tau_{j})(1-\phi_1)V_{1,jt},\ 1 - \exp\big({-\beta_{jt}^{\text{env}} (1-\theta_j) W_{jt} / (\kappa+W_{jt})}\big)\Big)\)
\(- d_{jt} V_{1,jt}\)	Background death among one-dose vaccinated individuals.	\(\text{Binom}\big( V_{1,jt},\; 1 - \exp(-d_{jt}) \big)\)
\(\mathbf{V_2}\) (two-dose OCV)
\(+ \nu_{2,jt}\)	Transition of one-dose vaccinated individuals to the two-dose vaccinated class (full course of OCV).	\(\text{Pois}\big( \nu_{2,jt} \big)\)
\(- \omega_2 V_{2,jt}\)	Waning immunity in the two-dose vaccinated class.	\(\text{Binom}\big( V_{2,jt},\; 1 - \exp(-\omega_2) \big)\)
\(- \Lambda^{V_2}_{j,t+1}\)	Human-to-human force of infection on the two-dose vaccinated class.	\(\text{Binom}\Big((1-\tau_{j})(1-\phi_2)V_{2,jt},\ 1 - \exp\big({-\beta_{jt}^{\text{hum}} ((1-\tau_{j})I_{jt} + \sum_{\forall i \not= j} (\pi_{ij}\tau_iI_{it}))^{\alpha_1} / N_{jt}^{\alpha_2}}\big)\Big)\)
\(- \Psi^{V_2}_{j,t+1}\)	Environment-to-human force of infection on the two-dose vaccinated class.	\(\text{Binom}\Big((1-\tau_{j})(1-\phi_2)V_{2,jt},\ 1 - \exp\big({-\beta_{jt}^{\text{env}} (1-\theta_j) W_{jt} / (\kappa+W_{jt})}\big)\Big)\)
\(- d_{jt} V_{2,jt}\)	Background death among two-dose vaccinated individuals.	\(\text{Binom}\big( V_{2,jt},\; 1 - \exp(-d_{jt}) \big)\)
\(\mathbf{E}\) (exposed)
\(+ \Lambda_{j,t+1}\)	Human-to-human force of infection contributing to new exposures.	\(\Lambda^{S}_{j,t+1} + \Lambda^{V_1}_{j,t+1} + \Lambda^{V_2}_{j,t+1}\)
\(+ \Psi_{j,t+1}\)	Environment-to-human force of infection contributing to new exposures.	\(\Psi^{S}_{j,t+1} + \Psi^{V_1}_{j,t+1} + \Psi^{V_2}_{j,t+1}\)
\(- \iota E_{jt}\)	Progression of exposed individuals toward the infectious class.	\(\text{Binom}\big( E_{jt},\; 1 - \exp(-\iota) \big)\)
\(- d_{jt} E_{jt}\)	Background death among exposed individuals.	\(\text{Binom}\big( E_{jt},\; 1 - \exp(-d_{jt}) \big)\)
\(\mathbf{I_1}\) (symptomatic)
\(+ \sigma\,\iota\,E_{jt}\)	Exposed individuals progressing to symptomatic infection.	\(\text{Binom}\big( \sigma E_{jt},\; 1 - \exp(-\iota) \big)\)
\(- \gamma_1 I_{1,jt}\)	Recovery from symptomatic infection.	\(\text{Binom}\big( I_{1,jt},\; 1 - \exp(-\gamma_1) \big)\)
\(- \mu_j I_{1,jt}\)	Deaths due to symptomatic infection.	\(\text{Binom}\big( I_{1,jt},\; 1 - \exp(-\mu_j) \big)\)
\(- d_{jt} I_{1,jt}\)	Background death among individuals with symptomatic infection.	\(\text{Binom}\big( I_{1,jt},\; 1 - \exp(-d_{jt}) \big)\)
\(\mathbf{I_2}\) (asymptomatic)
\(+ (1-\sigma)\,\iota\,E_{jt}\)	Exposed individuals progressing to asymptomatic infection.	\(\text{Binom}\big( (1-\sigma) E_{jt},\; 1 - \exp(-\iota) \big)\)
\(- \gamma_2 I_{2,jt}\)	Recovery from asymptomatic infection.	\(\text{Binom}\big( I_{2,jt},\; 1 - \exp(-\gamma_2) \big)\)
\(- d_{jt} I_{2,jt}\)	Background death among individuals with asymptomatic infection.	\(\text{Binom}\big( I_{2,jt},\; 1 - \exp(-d_{jt}) \big)\)
\(\mathbf{W}\) (environment)
\(+ \zeta_1 I_{1,jt}\)	Amount of V. cholerae (cells/mL) shed into the environment by symptomatic individuals.	\(\text{Pois}\big( \zeta_1 I_{1,jt} \big)\)
\(+ \zeta_2 I_{2,jt}\)	Amount of V. cholerae (cells/mL) shed into the environment by asymptomatic individuals.	\(\text{Pois}\big( \zeta_2 I_{2,jt} \big)\)
\(- \delta_{jt} W_{jt}\)	Decay of viable V. cholerae in the environment.	\(\text{Pois}\big( \delta_{jt} W_{jt} \big)\)
\(\mathbf{R}\) (recovered)
\(+ \gamma_1 I_{1,jt}\)	Recovery of individuals with symptomatic infection.	\(\text{Binom}\big( I_{1,jt},\; 1 - \exp(-\gamma_1) \big)\)
\(+ \gamma_2 I_{2,jt}\)	Recovery of individuals with asymptomatic infection.	\(\text{Binom}\big( I_{2,jt},\; 1 - \exp(-\gamma_2) \big)\)
\(- \varepsilon R_{jt}\)	Loss of immunity for recovered individuals.	\(\text{Binom}\big( R_{jt},\; 1 - \exp(-\varepsilon) \big)\)
\(- d_{jt} R_{jt}\)	Background death among recovered individuals.	\(\text{Binom}\big( R_{jt},\; 1 - \exp(-d_{jt}) \big)\)
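
The table above converts each daily rate into a per-step probability, \(1 - \exp(-\text{rate})\), and draws the number of individuals making the transition from a binomial distribution. The sketch below illustrates that pattern for the \(\mathbf{R}\) rows and for the human-to-human infection probability; it is not the MOSAIC implementation. The function names, the parameter values, the orientation of `pi` (row \(i\), column \(j\) for \(\pi_{ij}\)), the use of total infections for \(I\), and the clamp at zero are all assumptions made for illustration.

```python
import numpy as np

def rate_to_prob(rate):
    """Per-step transition probability for a daily rate: 1 - exp(-rate)."""
    return 1.0 - np.exp(-rate)

def human_infection_prob(I, N, tau, pi, beta_hum, alpha_1, alpha_2):
    """Human-to-human infection probability for each location j (Lambda rows).

    I, N, tau are 1-D arrays over locations; pi[i, j] is assumed to hold pi_ij.
    """
    pi_off = np.array(pi, dtype=float)   # copy so the caller's matrix is untouched
    np.fill_diagonal(pi_off, 0.0)        # the sum runs over i != j
    local = (1.0 - tau) * I              # infected residents who do not travel
    imported = (tau * I) @ pi_off        # entry j: sum_i pi[i, j] * tau[i] * I[i]
    hazard = beta_hum * (local + imported) ** alpha_1 / N ** alpha_2
    return 1.0 - np.exp(-hazard)

def step_recovered(R, I1, I2, gamma_1, gamma_2, epsilon, d, rng):
    """One daily update of R in a single location, following the R rows."""
    recovered_1 = rng.binomial(I1, rate_to_prob(gamma_1))  # + gamma_1 * I1
    recovered_2 = rng.binomial(I2, rate_to_prob(gamma_2))  # + gamma_2 * I2
    waned = rng.binomial(R, rate_to_prob(epsilon))         # - epsilon * R
    died = rng.binomial(R, rate_to_prob(d))                # - d * R
    # Independent draws from the same pool could in principle exceed R,
    # so this sketch clamps at zero (an assumption, not from the table).
    return max(R + recovered_1 + recovered_2 - waned - died, 0)

# Illustrative values only; none are fitted MOSAIC parameters.
rng = np.random.default_rng(seed=1)
R_next = step_recovered(R=1000, I1=40, I2=160, gamma_1=0.2, gamma_2=0.67,
                        epsilon=0.0003, d=0.00004, rng=rng)

I = np.array([50.0, 0.0])
N = np.array([1e5, 2e5])
tau = np.array([0.01, 0.02])
pi = np.array([[0.0, 1.0],
               [1.0, 0.0]])
p_inf = human_infection_prob(I, N, tau, pi, beta_hum=0.3,
                             alpha_1=0.95, alpha_2=0.95)
```

In this two-location example the second location has no local infections, so its infection probability comes entirely from travellers arriving from the first location.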

Term	Population	Equation	Notes
\(V^{\text{imm}}_{1,j,t+1}\)	Effectively immunized one-dose recipients	\(V^{\text{imm}}_{1,j,t+1} = V^{\text{imm}}_{1,jt}\) \(+ \ \phi_1 \nu_{1, jt} \cdot S_{jt} \big/ \big(S_{jt} + E_{jt}\big)\) \(- \ \omega_1 V^{\text{imm}}_{1,jt}\) \(- \ \nu_{2,jt} \cdot V^{\text{imm}}_{1,jt} \big/ \big(V^{\text{imm}}_{1,jt} + V^{\text{sus}}_{1,jt} \big)\)	+ Incoming newly vaccinated - Waning vaccine immunity (⇒ \(V^{\text{sus}}_{1}\)) - Second dose recipients (⇒ \(V_2\) compartment)
\(V^{\text{sus}}_{1,j,t+1}\)	Still susceptible one-dose recipients	\(V^{\text{sus}}_{1,j,t+1} = V^{\text{sus}}_{1,jt}\) \(+ \ (1 - \phi_1) \nu_{1, jt} \cdot S_{jt} \big/ \big(S_{jt} + E_{jt}\big)\) \(+ \ \omega_1 V^{\text{imm}}_{1,jt}\) \(- \ \big(\Lambda^{V_1}_{j,t+1} + \Psi^{V_1}_{j,t+1}\big)\) \(- \ \nu_{2, jt} \cdot V^{\text{sus}}_{1,jt} \big/ \big(V^{\text{imm}}_{1,jt} + V^{\text{sus}}_{1,jt} \big)\)	+ Incoming newly vaccinated + Waning vaccine immunity - Infected (⇒ \(E_{j,t}\)) - Second dose recipients (⇒ \(V_2\) compartment)
\(V^{\text{inf}}_{1,j,t+1}\)	Infected one-dose recipients	\(V^{\text{inf}}_{1,j,t+1} = V^{\text{inf}}_{1,jt}\) \(+ \ \big(\Lambda^{V_1}_{j,t+1} + \Psi^{V_1}_{j,t+1}\big)\)	+ One-dose recipients infected (⇒ \(E_{j,t}\)) Compartment used for tracking only.
\(V^{\text{imm}}_{2,j,t+1}\)	Effectively immunized two-dose recipients	\(V^{\text{imm}}_{2,j,t+1} = V^{\text{imm}}_{2,jt}\) \(+ \ \phi_2 \nu_{2, jt}\) \(- \ \omega_2 V^{\text{imm}}_{2,jt}\)	+ Incoming second dose recipients - Waning vaccine immunity (⇒ \(V^{\text{sus}}_{2}\))
\(V^{\text{sus}}_{2,j,t+1}\)	Still susceptible two-dose recipients	\(V^{\text{sus}}_{2,j,t+1} = V^{\text{sus}}_{2,jt}\) \(+ \ (1 - \phi_2) \nu_{2,jt}\) \(+ \ \omega_2 V^{\text{imm}}_{2,jt}\) \(- \ \big(\Lambda^{V_2}_{j,t+1} + \Psi^{V_2}_{j,t+1}\big)\)	+ Incoming second dose recipients + Waning vaccine immunity - Infected (⇒ \(E_{j,t}\))
\(V^{\text{inf}}_{2,j,t+1}\)	Infected two-dose recipients	\(V^{\text{inf}}_{2,j,t+1} = V^{\text{inf}}_{2,jt}\) \(+ \ \big(\Lambda^{V_2}_{j,t+1} + \Psi^{V_2}_{j,t+1}\big)\)	+ Two-dose recipients infected (⇒ \(E_{j,t}\)) Compartment used for tracking only.
\(V_{1,j,t}\)	Total one-dose recipients	\(V_{1,j,t} = V^{\text{imm}}_{1,j,t} + V^{\text{sus}}_{1,j,t} + V^{\text{inf}}_{1,j,t}\)	Sum of all one-dose sub-compartments. Used for tracking only; approximately equal to reported OCV campaign data.
\(V_{2,j,t}\)	Total two-dose recipients	\(V_{2,j,t} = V^{\text{imm}}_{2,j,t} + V^{\text{sus}}_{2,j,t} + V^{\text{inf}}_{2,j,t}\)	Sum of all two-dose sub-compartments. Used for tracking only; approximately equal to reported OCV campaign data.
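
The bookkeeping in the table above can be checked with a short deterministic sketch of the one-dose sub-compartments. It works with expected values rather than stochastic draws. The function name `step_one_dose`, the guards against empty pools, and the example numbers are hypothetical. The sketch also assumes that the \(S_{jt}/(S_{jt}+E_{jt})\) share of first doses applies to both the immunized and the still-susceptible inflows.

```python
def step_one_dose(v_imm, v_sus, v_inf, nu_1, nu_2, S, E,
                  phi_1, omega_1, new_infections):
    """Advance the one-dose sub-compartments by one day (expected values).

    new_infections stands for Lambda^{V_1} + Psi^{V_1} at this time step.
    """
    # Share of first doses given to susceptible (rather than exposed) people.
    share_S = S / (S + E) if (S + E) > 0 else 0.0
    first_doses = nu_1 * share_S

    # Second doses are drawn proportionally from the immunized and
    # still-susceptible one-dose recipients.
    pool = v_imm + v_sus
    to_v2_from_imm = nu_2 * v_imm / pool if pool > 0 else 0.0
    to_v2_from_sus = nu_2 * v_sus / pool if pool > 0 else 0.0

    waned = omega_1 * v_imm  # moves from V1_imm to V1_sus

    v_imm_next = v_imm + phi_1 * first_doses - waned - to_v2_from_imm
    v_sus_next = (v_sus + (1.0 - phi_1) * first_doses + waned
                  - new_infections - to_v2_from_sus)
    v_inf_next = v_inf + new_infections  # tracking only
    return v_imm_next, v_sus_next, v_inf_next

# Illustrative values only; none are fitted MOSAIC parameters.
v_imm_next, v_sus_next, v_inf_next = step_one_dose(
    v_imm=100.0, v_sus=50.0, v_inf=0.0, nu_1=30.0, nu_2=15.0,
    S=900.0, E=100.0, phi_1=0.6, omega_1=0.001, new_infections=2.0)
```

Under these assumptions the total \(V_1\) changes by the first doses received minus the second doses given, since waning and infection only move individuals between sub-compartments.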

Value(s)	Units	Infection	Description	Source
\(10^3\)	\(\text{cells}~\text{g}^{-1}~\text{stool}\)	Asymptomatic	Approx. 1 day of shedding at ~10³ vibrios per gram of stool	Mosley et al. (1968)
\(10^6\)–\(10^9\)	\(\text{cells}~\text{g}^{-1}~\text{stool}\)	NA	Number of fecal coliform indicator bacteria in human feces	Feachem et al. (1983)
\(1\)–\(100\)	\(\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}\)	All	Point estimate of 10; range 1–100 used in sensitivity analysis	Codeço (2001)
\(10\)	\(\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}\)	All	Assumed shedding rate used in epidemic model incorporating hyperinfectivity	Hartley et al. (2006)
\(\leq 10^5\)	\(\text{cells}~\text{g}^{-1}~\text{stool}\)	Asymptomatic	No symptoms; low-level shedding of vibrios	Nelson et al. (2009)
\(\leq 10^8\)	\(\text{cells}~\text{g}^{-1}~\text{stool}\)	Mild	Diarrhoea with moderate vibrios in stool	Nelson et al. (2009)
\(10^7\)–\(10^9\)	\(\text{cells}~\text{g}^{-1}~\text{stool}\)	Severe	Vomiting and profuse diarrhoea with high shedding	Nelson et al. (2009)
\(10^{10}\)–\(10^{12}\)	\(\text{cells}~\text{L}^{-1}~\text{stool}\)	Severe	Concentration in rice water stool from symptomatic individuals	Nelson et al. (2009)
\(0.01\)–\(10\)	\(\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}\)	All	Reported as general estimate across all infections	Fung (2014)
\(10\)–\(100\)	\(\text{cells}~\text{mL}^{-1}~\text{person}^{-1}~\text{day}^{-1}\)	All	Represents shedding rates in two distinct sub-populations	Njagarah & Nyabadza (2014)