Using Breeding Bird Survey and eBird data to improve marsh bird monitoring abundance indices and trends

The elusive nature of many marsh-breeding birds presents a challenge for effective population monitoring. The Great Lakes Marsh Monitoring Program (GLMMP), delivered by Birds Canada, addressed these challenges by concentrating survey efforts in marsh bird habitats and by using survey protocols aimed at maximizing marsh bird detections. GLMMP data suggest that numerous marsh bird species are declining. Here we consider the value of other avian monitoring programs to support our understanding of marsh bird population trends. Our goal was to compare the GLMMP, North American Breeding Bird Survey (BBS), and eBird with each other and with a combined survey, by evaluating frequency of detection, annual indices of abundance, and trend estimates. Using 23 years (1997–2019) of GLMMP, BBS, and eBird data, we calculated annual indices of abundance and trends for each survey for 18 marsh-breeding species across southern Ontario, Canada. We found that the GLMMP had more frequent detections, greater counts, and/or more precise trends for 8 species that breed almost exclusively in marshes, whereas 10 species with more variable habitat preferences had more frequent detections, greater counts, and/or more precise trends based on eBird and/or BBS. We found that combining counts from the GLMMP, BBS, and eBird increased the precision around trend estimates for 11/18 (61%) species; however, trend estimates for combined data tended to be positively biased relative to GLMMP trends for species that also frequent non-marsh habitats. We, therefore, provide evidence that combining citizen science data from multiple sources could increase the power to detect changes in marsh-dependent bird populations. Integrated datasets thus provide a promising avenue for future marsh bird conservation and management.


INTRODUCTION
Many birds that rely on marshes for foraging and breeding (i.e., marsh-dependent species) are considered "secretive" due to their low densities, infrequent vocalizations, inconspicuous behavior, and inaccessible habitats (Conway and Gibbs 2011, Steidl et al. 2013). Detecting marsh birds is difficult, given their tendency to remain silent and concealed within thick stands of emergent (above water) marsh vegetation (Conway and Gibbs 2011). As a result, standard survey methods, which do not specifically target marsh birds and their habitats, typically have low detection probabilities, and rigorous estimates of population status and trends are lacking for members of this group in many areas (ECCC 2019). Birds Canada, in partnership with Environment and Climate Change Canada and the United States Environmental Protection Agency, launched the Marsh Monitoring Program in 1995 to monitor wetland health using birds as indicator species (Birds Canada 2021). This volunteer citizen science program follows structured, standardized survey protocols aimed at maximizing marsh bird detections in order to collect occupancy and count data for these elusive birds (Bird Studies Canada 2009, Conway 2011). The program expanded beyond the Great Lakes into southern Québec in 2004, the Canadian Prairies from 2008 to 2012, parts of the Maritime Provinces in 2012, and southern British Columbia in 2021. Since its inception, Great Lakes Marsh Monitoring Program (GLMMP) participants have completed over 45,000 surveys at over 4600 marsh sites located predominantly throughout the southern portion of the Great Lakes basin. In this region, the GLMMP has achieved greater marsh bird detection rates than other, less specialized, avian monitoring programs (e.g., elusive marsh birds detected at > 21% of GLMMP survey routes per species per year on average versus < 5% in the North American Breeding Bird Survey [BBS]; Tozer 2016, Pardieck et al. 2020), which has allowed researchers to estimate population trends with greater power and precision (Steidl et al. 2013, Miller 2016).
Previous analyses of the GLMMP dataset detected declines for numerous marsh bird species between the mid-1990s and early 2010s (e.g., Timmermans et al. 2008, Tozer 2016). A recent examination of GLMMP data across the southern Great Lakes basin determined that the abundance of 6/18 (33%) marsh-breeding bird species had declined by 2.6−8.4% per year from 1995 to 2018 (Tozer 2020). Over the same period, 7/18 (39%) species increased, and 5/18 (28%) species showed no overall change (Tozer 2020). As a result of declining populations and/or rarity, various marsh birds are currently listed under the Ontario Endangered Species Act (2007) or are recognized as priority species of conservation concern under the Bird Conservation Strategy for the Lower Great Lakes/St. Lawrence Plain Bird Conservation Region (Environment Canada 2014). For declining species in particular, determining the best measures of marsh bird population status and trends will be vital for making appropriate and effective conservation and management decisions.
To this end, it is important to consider the value of other avian monitoring programs to support our understanding of patterns of change in marsh bird populations. In North America, the BBS is a leading bird-monitoring program, and eBird is a widespread data collection program now being used to estimate trends. These citizen science projects cover broad geographic and temporal scales and, in some instances, can provide high-resolution data to fill potential information gaps. The BBS provides long-term monitoring data for over 500 avian species across most of North America and is generally considered the most reliable source of avian population data in Canada and the United States (Downes et al. 2016, Sauer et al. 2013, 2017). The BBS is a structured survey program that uses a formal sampling design and a standardized field protocol. Although BBS data have been used for decades to estimate trends of marsh birds (Sauer et al. 2020), previous work has shown that its passive, roadside surveys sample only a limited portion of marsh bird habitats (Robbins et al. 1986, Gibbs and Melvin 1997, Lawler and O'Connor 2004) and do not detect marsh birds as frequently as call-broadcast surveys, which increase vocalization probability (Bogner and Baldassarre 2002, Conway and Gibbs 2011). While the BBS may have adequate data for continental-scale analysis, data may be inadequate at more restricted national or regional scales. Nevertheless, the utility of the BBS, either alone or combined with other datasets, for monitoring populations of marsh-dependent species has not been previously investigated.
eBird may be a promising source of information to support marsh bird monitoring. eBird is a global citizen science program that collects information about the distribution, occurrence, and abundance of avian species (Sullivan et al. 2009, 2014). Volunteers record observations from discrete locations and dates, and bird sightings are submitted in the form of checklists. Observations can be collected as either unstructured (incidental sightings) or semi-structured data (Hertzog et al. 2021). The eBird database has grown exponentially since the program's inception in 2002, and over 55 million checklists have been completed across North America (as of June 2022). Therefore, eBird has the potential to provide more data and greater spatial coverage compared to structured surveys, at lower cost and management effort (Sullivan et al. 2014). Previous studies have successfully modeled long-term avian population trends using eBird data and have shown that these trends broadly agree with trends estimated using the BBS for most of the species examined (Walker and Taylor 2017, Horns et al. 2018) and that they agree with trends estimated using other citizen science monitoring datasets (e.g., Common Breeding Bird Survey in Germany; Hertzog et al. 2021). Furthermore, eBird data have also been used to evaluate the population change of species that are challenging to survey and for which few other sources of data exist (e.g., boreal and Arctic breeding birds; Walker and Taylor 2020).
Integrating data from multiple sources also presents a possible method for supporting marsh bird monitoring. There is increasing evidence that data integration improves species distribution and trend estimates (e.g., Pacifici et al. 2017, Walker and Taylor 2017, Reich et al. 2018, Fletcher et al. 2019, Miller et al. 2019, Zipkin et al. 2019, Isaac et al. 2020, Hertzog et al. 2021). In particular, integrating differently structured data may combine the strengths of both structured (e.g., lower geographical bias) and unstructured datasets (e.g., larger amounts of data, lower research costs, and greater spatial coverage; Reich et al. 2018, Fletcher et al. 2019, Hertzog et al. 2021). Combining data from multiple sources can improve model estimate accuracy (i.e., the closeness of measurements to the "true" value) by alleviating the biases of individual datasets (Pacifici et al. 2017, Miller et al. 2019, Matthiopolos et al. 2022). Additionally, data integration can improve the precision of model estimates (i.e., the closeness of measurements to each other) by increasing the sample size for analysis (Hertzog et al. 2021, Matthiopolos et al. 2022). Moreover, by utilizing all available monitoring data, data integration has the potential to generate more robust trend estimates for species with low abundance or that are difficult to detect (reviewed in Hertzog et al. 2021).
Our objective was to compare marsh bird monitoring data from the GLMMP, BBS, eBird, and a combination of these three datasets. We analyzed annual indices and trends of 18 marsh-breeding species from across southern Ontario, Canada, including relatively rare and common species, as well as both marsh-dependent breeding species and species that nest both in wetlands and uplands, so we could explore potential differences across a range of abundances and life histories. We chose southern Ontario as our study area because it had good GLMMP, BBS, and eBird coverage and contained a large portion of the breeding distributions of many wetland-dependent species (Tacha and Braune 1994).

Survey methods and sampling frames
The GLMMP is a citizen science monitoring program delivered by Birds Canada (Birds Canada 2021). GLMMP data are freely available on the Nature Counts website (https://birdscanada.org/naturecounts/default/main.jsp). The GLMMP was originally established in 1995 for the purpose of evaluating the quality and recovery of wetlands around the Great Lakes basin using birds as indicators of wetland health. Over time, this purpose has evolved and expanded to include trend monitoring (Tozer 2016, Tozer 2020) and to address various questions of interest to wetland conservationists and managers (Tozer et al. 2018). Specific details of the GLMMP survey protocol can be found in Bird Studies Canada (2009). GLMMP participants establish their own routes or are assigned an existing survey route, each of which consists of 1 to 8 semi-circular 100 m radius plots (hereafter "sites") located within a single marsh or multiple marshes. Sites are spaced > 250 m apart to reduce double-counting of individuals. In each year that a route is active, sites are surveyed on at least 2 and optionally 3 different visits ≥ 10 days apart between 20 May and 5 July. Surveys are conducted in either the morning (surveys can begin 30 min before sunrise and end no later than 10:00 am) or evening (surveys begin no earlier than 4 h before sunset and are completed by dark). Since 2008, sites have been surveyed for 15 min during each visit. This is split into 5 min of passive (silent) observation, followed by 5 min of observation as species calls are broadcast, followed by a final 5 min of passive observation. From 1995 to 2007, sites were surveyed for only 10 min during each visit, consisting of 5 min of call-broadcasts followed by 5 min of passive observation. All analyses in this study are based on data filtered to match this 10-min pattern (5 min of call-broadcasts followed by 5 min of passive observation) across all years. During broadcasts, 30 sec of vocalizations followed by 30 sec of silence are played for each of the following: Least Bittern (Ixobrychus exilis), Sora (Porzana carolina), Virginia Rail (Rallus limicola), a mix of American Coot (Fulica americana) and Common Gallinule (Gallinula galeata), and Pied-billed Grebe (Podilymbus podiceps). The broadcast encourages these elusive species to reveal themselves by approaching or vocally responding. During each survey, skilled participants record the number of individuals of all detected species at each site and the time of detection. Surveys are only performed under favorable weather conditions, when there is good visibility, warm temperatures (≥ 16°C), no precipitation, and little to no wind. Marsh bird habitat descriptions are also recorded each year for each site.
Most GLMMP routes are chosen haphazardly by participants. However, the GLMMP sampling frame has been improved over time to reflect the program's evolving purpose. These improvements include the addition of a set of randomly chosen survey routes, which have been sampled annually since 2016 and comprise ~25% of the routes sampled each year in southern Ontario. Preliminary analysis comparing marsh bird results at the random stations versus the participant-chosen stations shows little or no bias in abundance estimates for 9 out of 10 marsh bird species tested (Tozer 2020).
The BBS was launched in 1966 with the purpose of instituting a continental monitoring program for breeding bird populations in North America (Sauer et al. 2017). The BBS is a standardized survey conducted along randomly established roadside survey routes. Data were available for a total of 66 BBS routes in our southern Ontario study area at the time of analysis (data available at https://www.pwrc.usgs.gov/bbs/rawdata/). Routes are surveyed once annually, between 28 May and 7 July, by experienced volunteer or professional birders. Surveys begin 30 min before local sunrise and are typically completed in 4-5 h. Each route is composed of 50 stops, spaced 800 m apart, and volunteers progress from stop to stop in sequence. Stops are surveyed for 3 min each, during which a single observer detects, identifies, and counts the number of individuals of each species seen or heard within a 400 m radius of each stop. Supplemental information, including data on weather and vehicle and other background noise, is also collected during the survey.
eBird is the largest global database of bird observations (Sullivan et al. 2014). eBird users record their observations on dates and locations of their choosing, and bird sightings are submitted in the form of checklists, which are permanently archived and openly accessible on the eBird website (https://ebird.org/home). Typically, eBird users assign their observations to one of four core protocols (more specialized protocols also exist: https://support.ebird.org/en/support/home): traveling (observers recorded sightings along a route > 30 m from starting point of checklist), stationary (surveys occurred at a single, fixed location), historical (retrospective data entry), and incidental (birdwatching was not the observer's primary focus when a bird was observed). Under the traveling and stationary protocols, users record the amount of time spent birding and, for traveling protocols, the distance traveled. Users also indicate whether they are submitting complete checklists, wherein all birds present and identifiable are reported. Complete checklists collected under the traveling and stationary protocols are considered semi-structured data (Hertzog et al. 2021). As of June 2022, a total of 2.8 million complete checklists have been collected from Ontario.

Study area and species
We restricted our analysis to the Ontario portion of the Lower Great Lakes/St. Lawrence Plain Bird Conservation Region (i.e., BCR 13), excluding Manitoulin Island (Fig. 1). Manitoulin Island was omitted as comparable landscape-scale spatial wetland data were not available for this location. BCR 13 contains important lakeshores and wetlands and hosts some of the greatest densities of migrant waterbirds in eastern North America (U.S. NABCI 2000). Our study area also had good spatial coverage for all three monitoring programs throughout the 1997 to 2019 study period, with a total of 11,525 GLMMP surveys (from 2306 sites along 709 routes), 53,950 BBS surveys (from 3300 stops along 66 routes), and 118,462 eBird checklists completed during these years.

Data collection and filtering
GLMMP data were downloaded from NatureCounts (Birds Canada 2020), and BBS data were obtained from the online BBS database (2020 data release; Pardieck et al. 2020). GLMMP data are commonly analyzed at the site level (e.g., Crewe et al. 2005, Tozer 2016), and eBird data are typically analyzed at the checklist level (e.g., Walker and Taylor 2017, Horns et al. 2018). We, therefore, considered BBS counts at the stop level to make our analyses more analogous among datasets. Stop-level BBS data were available from 1997 to 2019 (date range: 28 May-7 July), so we filtered the GLMMP and eBird datasets to only include data collected from 1997 onwards. All data were zero-filled so that any species not reported on a site/stop/checklist was given a count of zero (Strimas-Mackey et al. 2020).
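The zero-filling step can be illustrated with a minimal sketch. The original analysis was carried out in R; the Python below is only an illustration, and the function name, record layout, and example values are hypothetical rather than taken from the study.

```python
def zero_fill(survey_ids, detections, species_list):
    """Return {(survey_id, species): count}, with 0 where a species was not reported.

    survey_ids   : every site, stop, or checklist that was surveyed
    detections   : (survey_id, species, count) tuples for reported species only
    species_list : the study species
    """
    # Start from an explicit zero for every survey x species combination.
    counts = {(s, sp): 0 for s in survey_ids for sp in species_list}
    # Overwrite zeros with the reported counts.
    for survey_id, species, count in detections:
        if (survey_id, species) in counts:
            counts[(survey_id, species)] += count
    return counts


# Hypothetical example: three surveys, two species, two detections.
surveys = ["stop_01", "stop_02", "stop_03"]
species = ["Sora", "Virginia Rail"]
detections = [("stop_01", "Sora", 2), ("stop_03", "Virginia Rail", 1)]
filled = zero_fill(surveys, detections, species)
```

Without this step, surveys on which a species went unreported would be absent from the data rather than contributing a count of zero.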
We downloaded the December 2020 release of the eBird basic dataset (eBird 2021) for Ontario and our 18 study species using the auk package (Strimas-Mackey et al. 2018) in R version 4.0.3 (R Core Team 2020). Data were filtered following the instructions in Strimas-Mackey et al. (2020) to only include complete checklists collected from 1997 to 2019 and between 20 May and 7 July of each year (i.e., a date range inclusive of the timing of both the GLMMP and BBS surveys) using standard traveling or stationary count protocols. To minimize variations due to differences in sampling methods, eBird data were filtered to only include checklists where observations were made for ≤ 5 h, over ≤ 5 km, and by ≤ 10 observers. In addition, we reduced shared checklists from multi-person birding groups into a single checklist and assigned a unique observer identifier to each multi-person group (as in Walker and Taylor 2017, 2020). eBird data were also temporally filtered according to the GLMMP survey morning and evening time windows. eBird checklists were thus filtered to only include observations made either between 30 min before sunrise and 10:00 am or between 4 h before sunset and dusk. We determined the time of sunrise, sunset, and dusk for each checklist location and date of observation using the getSunlightTimes function in the suncalc package (Thieurmel and Elmarhraoui 2019).
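The effort, date, and time-of-day filters described above can be sketched as a single predicate. This is a minimal illustration in Python, not the R code used in the study. The dictionary keys are hypothetical, sunrise, sunset, and dusk are assumed to be supplied for each checklist (the study obtained them from suncalc), and applying the time windows to the checklist start time is an assumption.

```python
from datetime import datetime, timedelta


def in_date_window(dt):
    """True if the date falls between 20 May and 7 July (inclusive)."""
    return (5, 20) <= (dt.month, dt.day) <= (7, 7)


def in_time_window(start, sunrise, sunset, dusk):
    """True if the start time falls in the morning or evening GLMMP window."""
    ten_am = start.replace(hour=10, minute=0, second=0, microsecond=0)
    morning = sunrise - timedelta(minutes=30) <= start <= ten_am
    evening = sunset - timedelta(hours=4) <= start <= dusk
    return morning or evening


def keep_checklist(c):
    """Apply the completeness, protocol, year, date, effort, and time filters."""
    return (
        c["complete"]
        and c["protocol"] in {"traveling", "stationary"}
        and 1997 <= c["start"].year <= 2019
        and in_date_window(c["start"])
        and c["duration_h"] <= 5
        and c["distance_km"] <= 5
        and c["n_observers"] <= 10
        and in_time_window(c["start"], c["sunrise"], c["sunset"], c["dusk"])
    )


# Hypothetical checklist: an early-morning traveling count on 1 June 2015.
example = {
    "complete": True, "protocol": "traveling",
    "start": datetime(2015, 6, 1, 6, 15),
    "sunrise": datetime(2015, 6, 1, 5, 40),
    "sunset": datetime(2015, 6, 1, 20, 50),
    "dusk": datetime(2015, 6, 1, 21, 25),
    "duration_h": 1.5, "distance_km": 2.0, "n_observers": 2,
}
midday = dict(example, start=datetime(2015, 6, 1, 13, 0))
kept, dropped = keep_checklist(example), keep_checklist(midday)
```

In this example the early-morning checklist passes every filter, whereas the otherwise identical midday checklist falls outside both time windows.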
The GLMMP, BBS, and eBird datasets were spatially filtered to only include sites, stops, or checklists within our study area (the BCR 13 spatial data layer from Bird Studies Canada and NABCI [2014] was used for filtering). Previous studies show that marsh bird occupancy and abundance are positively associated with marsh cover (Tozer et al. 2010, Tozer 2016). By design, all GLMMP surveys occur in wetlands or portions of wetlands dominated by emergent marsh vegetation (Bird Studies Canada 2009). To minimize variations due to spatial differences in sampling, BBS and eBird data were further filtered to better match the spatial distribution of GLMMP surveys. For each species, BBS and eBird data were filtered to only include stops/checklists that were within 100 or 200 m of a marsh. Marsh land cover was generated following the methods of Tozer et al. (2020), using a dissolved combination of marsh polygons from the Ontario Wetlands layer (https://www.ontario.ca/page/land-information-ontario) and the Great Lakes Coastal Wetland Inventory (https://www.greatlakeswetlands.org/Home.vbhtml). We drew 100 or 200 m buffers around the perimeter of each marsh, and all BBS stops and eBird checklists within either these 100 or 200 m buffers or within the wetland itself were included in our analyses. Buffer distances varied by species and were based on in-person effective detection radii for marsh birds reported by Stewart et al. (2020), where species with quieter calls were assigned a 100 m detection radius and species with farther-carrying calls were given a 200 m detection radius (Table 1). All spatial filtering was performed in QGIS (version 3.4.9-Madeira; QGIS Development Team 2017). Because the marsh land cover spatial layer is incomplete (it is missing polygons for some marshes, particularly numerous smaller marshes throughout the study area; personal observation), we also included BBS stops and eBird checklist locations outside of the spatial layer where at least one of the following marsh-dependent species was detected in at least one year, as detections of these species would indicate the presence of a marsh: American Bittern, American Coot, Black Tern, Common Gallinule, Least Bittern, Marsh Wren, Pied-billed Grebe, Sora, Swamp Sparrow, and Virginia Rail. For the 100 m detection radius, 55% of eBird checklists and 86% of BBS stops were added in this way, and for the 200 m detection radius, 42% of eBird checklists and 74% of BBS stops were added in this way. We note that we could not identify any marshes missing from the spatial layer where none of these species were detected, which could bias estimates of frequency of detection upwards for eBird and the BBS. We also note that location coordinates for eBird traveling checklists do not indicate the entire route traveled by an observer and are therefore much less accurate than coordinates for stationary point counts conducted at GLMMP and BBS sites. This is because eBird observations from a single checklist might, for example, be made several hundred meters or more apart. As such, our analysis may include checklists with coordinates that fall within 100 or 200 m of a marsh, but for which routes were not near a marsh for most of their length. Conversely, we may have excluded checklists where observations occurred near a marsh, but for which reported coordinates were outside of our 100 or 200 m buffers. This introduces additional error to eBird estimates, as our spatially filtered eBird dataset may include some observations from non-marsh habitats and may omit some marsh-associated observations. We also note that the same eBird site or nearly the same eBird site might have been given different names and that some eBird sites might have been visited more than once in the same year. We suspect that these shortcomings had little impact on our results, although we have no solid evidence to support our speculation.
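The buffer rule (keep a stop or checklist if it lies inside a marsh polygon or within the species-specific buffer distance of its perimeter) can be sketched with plain planar geometry. The study performed this step in QGIS; the Python below is only an illustrative stand-in that assumes coordinates in a projected system measured in meters and a simple polygon without holes. All names and coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def inside_polygon(p, poly):
    """Ray-casting test for a point inside a simple polygon."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def near_marsh(p, poly, buffer_m):
    """True if p is inside the marsh or within buffer_m of its perimeter."""
    if inside_polygon(p, poly):
        return True
    edges = zip(poly, poly[1:] + poly[:1])
    return min(point_segment_distance(p, a, b) for a, b in edges) <= buffer_m


# Hypothetical 300 m x 200 m marsh and a stop located 150 m east of it.
marsh = [(0, 0), (300, 0), (300, 200), (0, 200)]
stop = (450, 100)
kept_200 = near_marsh(stop, marsh, 200)  # kept for a 200 m species
kept_100 = near_marsh(stop, marsh, 100)  # dropped for a 100 m species
```

The same location can therefore be retained for a species assigned the 200 m radius and excluded for one assigned the 100 m radius.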
Finally, we explored how data from all three programs performed together by merging filtered GLMMP, BBS, and eBird data into a single dataset (hereafter referred to as "combined" data). Unlike the GLMMP and BBS, eBird data contained an additional variable for protocol type (stationary or traveling). Therefore, to ensure that all individual datasets contained the same information and were identical in structure, we set protocol type to stationary for all GLMMP and BBS surveys (details in Statistical Analysis). This allowed us to directly join the GLMMP, BBS, and eBird data into one combined dataset. We also added a variable to the combined dataset to denote whether a survey was collected as a part of the GLMMP, BBS, or eBird.
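A minimal sketch of this merge, in Python rather than the R used for the study, is given here. The field names (`protocol`, `source`) and record contents are hypothetical; the only logic taken from the text is that GLMMP and BBS surveys are labeled as stationary and that each record is tagged with its program of origin.

```python
def combine(glmmp, bbs, ebird):
    """Stack the three datasets into one list of records with a shared structure."""
    combined = []
    for source, rows in (("GLMMP", glmmp), ("BBS", bbs), ("eBird", ebird)):
        for row in rows:
            record = dict(row)          # copy so the inputs are not modified
            record["source"] = source   # program of origin
            if source != "eBird":
                # GLMMP sites and BBS stops are fixed-point surveys.
                record["protocol"] = "stationary"
            combined.append(record)
    return combined


# Hypothetical one-record datasets.
glmmp = [{"survey_id": "site_A", "year": 2010, "count": 3}]
bbs = [{"survey_id": "stop_12", "year": 2010, "count": 0}]
ebird = [{"survey_id": "chk_9", "year": 2010, "count": 1, "protocol": "traveling"}]
combined = combine(glmmp, bbs, ebird)
```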

Statistical analysis
We calculated frequency of occurrence of each species in the GLMMP, BBS, eBird, and combined datasets by dividing number of occupied sites/stops/checklists by the total number of site-years (GLMMP), stop-years (BBS), checklist-years (eBird), or a combination of site-, stop-, and checklist-years (combined).
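As a worked illustration of this ratio, the following Python sketch computes frequency of occurrence from zero-filled counts. The function name and example values are hypothetical.

```python
def frequency_of_occurrence(counts):
    """Proportion of site-, stop-, or checklist-years with at least one detection.

    counts: one zero-filled count per site-year, stop-year, or checklist-year.
    """
    if not counts:
        raise ValueError("no surveys supplied")
    occupied = sum(1 for c in counts if c > 0)
    return occupied / len(counts)


# Hypothetical species detected on 2 of 8 site-years.
freq = frequency_of_occurrence([0, 0, 2, 0, 1, 0, 0, 0])
```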
We compared annual abundance indices and trends generated from GLMMP, BBS, eBird, and combined data following the approach of Walker and Taylor (2017, 2020). First, we fit models relating species counts to a set of predictor variables using generalized linear mixed-effects models (GLMMs). In our final, filtered datasets, all GLMMP and BBS observations and most eBird checklists (> 96%) included species count data. We, therefore, used total counts of each species at each site, stop, or checklist as a response variable. We note that total counts for the GLMMP consisted of the maximum count from any one of the 2-3 annual visits made to each site. As total counts were overdispersed, models were fit using negative binomial GLMMs with log-link functions (Zuur et al. 2009). For all models, inclusion of a zero-inflation parameter did not improve model fit, as determined by AIC scores (Burnham and Anderson 2002). Therefore, we did not account for zero-inflation in our models.
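Two of the steps above lend themselves to a short sketch: taking the maximum count across visits for each GLMMP site-year, and a crude check for overdispersion. The Python below is illustrative only. The variance-to-mean ratio is a simple heuristic chosen here for illustration and is not stated to be the diagnostic used in the study. All names and values are hypothetical.

```python
from statistics import mean, variance


def site_year_max(visit_counts):
    """Reduce per-visit counts to one response value per (site, year)."""
    return {key: max(counts) for key, counts in visit_counts.items() if counts}


def dispersion_ratio(counts):
    """Sample variance divided by the mean; values well above 1 suggest
    more variability than a Poisson distribution would allow."""
    return variance(counts) / mean(counts)


# Hypothetical counts from 2-3 visits per site-year.
visits = {("site_A", 2010): [0, 2, 1], ("site_B", 2010): [0, 0]}
response = site_year_max(visits)

# Hypothetical zero-heavy counts with a few large values.
ratio = dispersion_ratio([0, 0, 0, 1, 0, 7, 0, 0, 2, 0])
```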
For all four datasets, models were fit for each species that included fixed effects for year, number of observers, and the total number of species counted at each site, stop, and/or checklist. eBird models also included protocol type (traveling or stationary) as a fixed effect, and combined models also included protocol and data source (GLMMP, BBS, or eBird) as fixed effects.
Year was encoded as a factor to produce annual indices of abundance for each year. This facilitated comparisons of annual abundance among datasets when nonlinear temporal patterns occurred (Walker and Taylor 2017). We used total number of species counted (encoded as a continuous variable; log transformed to improve model fit) and number of observers as proxies for observer skill and effort (as in Roberts et al. 2007, Szabo et al. 2010, Walker and Taylor 2017, Horns et al. 2018). These factors were also included in all models to make the data and model structure identical across the three datasets, in part to facilitate direct comparison of annual indices. All models included a random intercept for observer identity to account for further variation in observer skill. We also included a random intercept to account for the spatial grouping of surveys. GLMMP sites and BBS stops were grouped by route. eBird checklists were similarly grouped by locality, a name assigned to all surveys conducted in the same location (typically, localities are eBird "hotspots" where a high number of observers have submitted checklists; https://ebird.org/hotspots). Therefore, GLMMP and BBS models included a random intercept for route, whereas eBird models included a random intercept for locality identity, and combined models included a random intercept that denoted each survey's route (for GLMMP and BBS observations) or locality (for eBird observations).
Inclusion of data origin as a covariate in combined models allowed us to compare mean species counts across all years in the GLMMP, BBS, and eBird datasets. We computed least-squares means using the lsmeans function in the emmeans package (Lenth 2022), and differences among means were evaluated using the Sidak multiple comparisons test from the cld function in the multcomp package (Hothorn et al. 2008).
We compared GLMM fits using the model_performance function in the performance package (Lüdecke et al. 2020). Goodness of fit was compared using marginal and conditional R² values. Marginal R² indicates the variance explained by the fixed effects, and conditional R² indicates the variance explained by the entire model (i.e., both the fixed and random effects; Nakagawa and Schielzeth 2013). Model accuracy was compared by calculating root mean square error (RMSE) and residual standard deviation (Sigma), which are measures of the difference between observed and predicted values (Lüdecke et al. 2020). Differences in model fit metrics among datasets were evaluated by assessing whether confidence intervals for each dataset overlapped.
Annual indices of abundance were estimated by calculating predicted values for each year (fitted as a factor) from the above GLMMs. Predicted values were generated using the ggpredict function in the ggeffects package (Lüdecke 2018). Predicted values were computed by holding the total number of species counted at its mean value, protocol type at stationary (eBird and combined models only), data source at GLMMP (combined model only), and random effects at the population level (i.e., NA). For each species, annual indices were compared by calculating weighted Pearson's correlation coefficients between the indices of each combination of datasets (completed using the wtd.cor function in the weights package; Pasek et al. 2020). For each comparison, we used as weights the inverse sum of the variance around annual parameter estimates of the two datasets being compared, such that years with greater uncertainty were given less weight (Walker and Taylor 2017). GLMM fits were further compared by generating side-by-side plots of the annual indices of abundance of each dataset. We fit LOESS smoothers with spans of 1 through the annual indices for each species to aid with visual interpretations of temporal trends in abundance.
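The weighted correlation between two sets of annual indices can be sketched as follows. The study used wtd.cor in R; this Python version is an illustrative equivalent of a weighted Pearson coefficient only. Reading "inverse sum of the variance" as 1 / (variance in dataset A + variance in dataset B) for each year is an assumption, and all names and values are hypothetical.

```python
import math


def inverse_variance_sum_weights(var_a, var_b):
    """One weight per year: 1 / (variance in dataset A + variance in dataset B)."""
    return [1.0 / (va + vb) for va, vb in zip(var_a, var_b)]


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation coefficient between x and y."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)


# Hypothetical annual indices and their variances for two datasets.
index_a = [0.40, 0.35, 0.30, 0.32]
index_b = [0.20, 0.19, 0.15, 0.18]
weights = inverse_variance_sum_weights([0.01, 0.02, 0.01, 0.05],
                                       [0.02, 0.02, 0.01, 0.04])
r = weighted_pearson(index_a, index_b, weights)
```

Under this weighting, a year in which either dataset has a highly uncertain index contributes little to the coefficient.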
We estimated trends in abundance for each species and dataset by fitting linear models (LMs) relating annual indices in abundance to year. These LMs included annual indices generated from the above GLMMs as the response variable and year as a fixed effect, with annual sample sizes as weights. Year was encoded as a continuous variable to assess temporal trends, and observations were weighted by annual sample size to reflect greater uncertainty in years with fewer observations. Trends were estimated with annual indices rather than raw counts because increases in the number of surveys over time, particularly for eBird (where the number of checklists increased exponentially in the study area from 137 in 1997 to > 21,000 in 2019), would generate trends that were disproportionately influenced by more recent years (Walker and Taylor 2020). Lines of best fit through annual indices from the above GLMMs were estimated using the predict function in the stats package (R Core Team 2020). Trends in abundance were then calculated as the geometric mean rate of change between predicted values from the line of best fit for the first (1997) and last (2019) year, expressed as annual percent change as described in equation 4 of Smith et al. (2014). Confidence intervals for trends were estimated with bootstrapped models fitted using the boot package (Davison and Hinkley 1997, Canty and Ripley 2021). We scored the precision of each trend estimate based on 95% CI widths (upper limit − lower limit; as in ECCC 2014), where trends were classified into three precision categories: (1) high = 95% CI width < 3.5, (2) medium = 95% CI width between 3.5 and 6.7, and (3) low = 95% CI width > 6.7. Trends with high and medium scores are sufficiently precise to identify a 30% or 50% change in abundance, respectively, over 20 years; trends with low scores are too imprecise to confidently identify a 50% change in abundance over 20 years (ECCC 2014). We evaluated whether BBS, eBird, and the combined model's trend estimates were biased relative to those of the GLMMP following the methods of Robinson et al. (2021). First, we calculated the trend difference by subtracting GLMMP trends from BBS, eBird, or the combined trends. Positive trend differences indicated that the other datasets estimated more positive trends than the GLMMP, and negative trend differences indicated that the other datasets estimated more negative trends than the GLMMP. We then quantified differences among trends as a percentage by converting each difference to a proportion of the GLMMP trend estimate. Trends generated from GLMMP, BBS, eBird, and combined models were also compared by calculating Pearson's correlation coefficients between trends across species for each combination of datasets.
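The trend calculations can be sketched end to end: a sample-size-weighted line through the annual indices, the geometric mean rate of change between its first- and last-year predictions, the precision category, and the difference from the GLMMP trend. The Python below is an illustration, not the R code used in the study. The rate-of-change expression is the standard geometric mean formula, and its exact correspondence to equation 4 of Smith et al. (2014) is an assumption. Dividing the difference by the signed GLMMP trend is also an assumption, and all example values are hypothetical.

```python
def weighted_line(years, indices, weights):
    """Weighted least-squares intercept and slope of index against year."""
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, years)) / sw
    my = sum(w * y for w, y in zip(weights, indices)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, years, indices))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, years))
    slope = sxy / sxx
    return my - slope * mx, slope


def geometric_trend(first_value, last_value, first_year, last_year):
    """Annual percent change implied by the first and last predicted values."""
    ratio = last_value / first_value
    return 100.0 * (ratio ** (1.0 / (last_year - first_year)) - 1.0)


def precision_category(ci_lower, ci_upper):
    """Classify a trend by 95% CI width (thresholds as given in the text)."""
    width = ci_upper - ci_lower
    if width < 3.5:
        return "high"
    return "medium" if width <= 6.7 else "low"


def trend_difference(other_trend, glmmp_trend):
    """Difference from the GLMMP trend, raw and as a percentage of it."""
    diff = other_trend - glmmp_trend
    return diff, 100.0 * diff / glmmp_trend


# Hypothetical annual indices for three years, weighted by sample size.
years, indices, n = [1997, 2008, 2019], [1.00, 0.80, 0.50], [50, 200, 400]
intercept, slope = weighted_line(years, indices, n)
first, last = intercept + slope * 1997, intercept + slope * 2019
trend = geometric_trend(first, last, 1997, 2019)  # negative: a decline
```

Because the trend is computed from the fitted end points rather than from the raw indices, it reflects the overall direction of change across the period rather than year-to-year fluctuation.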

Frequency of occurrence and counts
Most of the 18 marsh-breeding species included in this study were observed infrequently by all three monitoring programs (Table 2). For all species, eBird offered the greatest number of observations, with between 1.2 and 2 times more surveys completed per year than the GLMMP and BBS in our filtered datasets. Most species were detected with similar frequencies by GLMMP and eBird surveys. Notable exceptions were Virginia Rail and Marsh Wren, which were most frequently detected in GLMMP surveys, whereas Canada Goose, Common Grackle, Sandhill Crane, and Trumpeter Swan were most frequently detected in eBird surveys. For all species except Common Grackle and Wilson's Snipe, frequency of occurrence was lowest in BBS surveys. Wilson's Snipe was less frequently detected in GLMMP surveys relative to the other two monitoring programs. Counts were highest in the GLMMP for 11 species and were lowest in BBS surveys for 9 species (detailed comparisons of mean counts among datasets are shown in Appendix 1).

Model performance
Overall, GLMMP, BBS, eBird, and combined GLMMs had similar fits and predictive accuracies, as determined by 95% confidence interval overlap for all model fit metrics (Fig. 2; raw model performance values are given in Appendix 2). Marginal R² values tended to be highest for the combined models (Fig. 2A), and conditional R² values tended to be higher for combined and eBird models (Fig. 2B); however, these differences were not statistically significant. Likewise, although BBS models tended to show the smallest differences between predicted and observed values, RMSE (Fig. 2C) and Sigma (Fig. 2D) values did not significantly differ among datasets. It should be noted, however, that numerous models had poor convergence (Appendices 2 and 3), and that poor convergence was more typical of species with low occurrence (Table 2). Our analyses of model performance metrics should, therefore, be interpreted with caution.

Annual indices and trends

GLMMP, BBS, and eBird
Overall, data from the GLMMP, BBS, and eBird indicated similar changes in abundance over time. Plots of annual indices of abundance generally showed comparable temporal variations (Fig. 3, Appendix 3), and 33/54 (61%) correlations between annual indices in these three datasets were positive (Appendix 4). Moreover, population trends estimated using annual indices were positively correlated across species for all three datasets (Fig. 4). Annual indices from all datasets had similar average Pearson's correlation coefficients (r) across species, with values of 0.14 ± 0.33 (mean ± SD) for GLMMP and BBS indices, 0.13 ± 0.29 for GLMMP and eBird indices, and 0.13 ± 0.33 for BBS and eBird indices. However, a comparison of trends showed better agreement between GLMMP and eBird data, as trends estimated using these datasets showed the strongest correlation (r = 0.69, P = 0.001). This was compared to weaker and non-significant correlations between BBS and eBird trends (r = 0.44, P = 0.07) and between GLMMP and BBS trends (r = 0.26, P = 0.3).
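As an illustration of the across-species comparison reported above, the following Python sketch computes Pearson's correlation coefficient between trend estimates for each pair of datasets. The trend values are hypothetical placeholders rather than results from this study, and the helper name `pearson_r` is ours; the original analysis was carried out in R.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical trends (%/year) for the same four species in each dataset.
trends = {
    "GLMMP": [-2.0, -1.0, 0.5, 1.5],
    "BBS": [0.5, -1.5, 1.0, 0.0],
    "eBird": [-1.0, -0.5, 1.0, 2.5],
}

# One coefficient per pair of datasets, computed across species.
pairwise = {
    (a, b): pearson_r(trends[a], trends[b]) for a, b in combinations(trends, 2)
}
for pair, r in pairwise.items():
    print(pair, round(r, 2))
```

Each sequence must list species in the same order so that values are paired by species.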
Compared to the GLMMP, BBS trend estimates were less precise for most species, including seven relatively rare and/or elusive, marsh-dependent species (American Coot, Common Gallinule, Least Bittern, Mute Swan, Sora, Trumpeter Swan, and Virginia Rail) and two songbirds that breed exclusively in wetland habitats (Swamp Sparrow and Marsh Wren; Appendix 5). BBS trends were also less precise for two species that also nest and feed in non-marsh habitats, such as upland areas, rivers, small lakes, and/or swamps (Canada Goose and Sandhill Crane). The mean (± SD) width of the 95% CIs around trends for these 11 species based on the BBS was 12 (± 12) compared to 6.2 (± 5.0) for the GLMMP. BBS trends for Pied-billed Grebe were estimated with precision similar to that of the GLMMP, with a slight positive bias. Although the BBS produced higher trend precision for Red-winged Blackbird, Wilson's Snipe, and Common Yellowthroat, BBS estimates tended to be more negative compared to those of the GLMMP for these species. The BBS also generated similar or higher trend precision for Black Tern, American Bittern, and Common Grackle, but trends were positively biased relative to trends estimated from the GLMMP.
Compared to the GLMMP, eBird trend estimates were also less precise for most species, including eight rare and/or marsh-dependent species (Common Gallinule, Least Bittern, Marsh Wren, Mute Swan, Sora, Swamp Sparrow, Trumpeter Swan, and Virginia Rail), as well as Canada Goose and Sandhill Crane (Appendix 5). The mean (± SD) width of the 95% CIs around trends for these ten species based on eBird was 15 (± 9) compared to 6.2 (± 4.9) for the GLMMP. eBird trends for Common Yellowthroat, American Coot, Pied-billed Grebe, Black Tern, Common Grackle, and Red-winged Blackbird were estimated with precision similar to that of the GLMMP, with a negative bias for Common Yellowthroat and a positive bias for the remaining species. eBird estimated trends with higher precision for Wilson's Snipe and American Bittern, but both estimates tended to be more negative compared to those of the GLMMP.
Species that were detected very infrequently in at least one dataset often showed weak (r < 0.15) or negative correlations between annual indices, poor GLMM convergence, poor precision in trend estimates, and differences in trend directions among datasets. For instance, this was the case for all correlations with the BBS annual indices for American Coot, Common Gallinule, Least Bittern, Marsh Wren, Mute Swan, Sora, and Virginia Rail, and for all correlations among annual indices for Trumpeter Swan. Similarly, trends for many species with infrequent detections were estimated with low precision (e.g., American Coot, Sandhill Crane, Trumpeter Swan). It is also important to note that annual indices from different datasets were often on very different scales. For example, the annual indices for Marsh Wren from GLMMP models were an order of magnitude higher than indices from BBS and eBird models, likely due to greater detections and counts of this species in GLMMP surveys.

Combined dataset
Overall, results for combined data were consistent with results from individual datasets. Plots of combined annual indices generally demonstrated variations over time that were similar to those of the GLMMP, BBS, and eBird data (Fig. 3, Appendix 3), and 49/54 (91%) correlations between combined indices and indices from the other three datasets were positive (Appendix 4). Furthermore, trends in abundance from the combined data were positively correlated across species with trends from the GLMMP, BBS, and eBird (Fig. 4). Annual indices from the combined models were most strongly correlated with GLMMP indices (mean r across species = 0.72 ± 0.29), while correlations between combined and BBS and eBird indices were weaker (mean r across species = 0.38 ± 0.43 and 0.50 ± 0.24, respectively). Trends estimated using combined data showed strong correlations with GLMMP and eBird trends across species (GLMMP: r = 0.85, P < 0.0001; eBird: r = 0.91, P < 0.0001). There was also a positive correlation between combined trends and trends estimated from the BBS (r = 0.53, P = 0.03).

Fig. 2.
Comparison of fit metrics among models used to generate annual indices of abundance using data from the Great Lakes Marsh Monitoring Program (GLMMP), North American Breeding Bird Survey (BBS), eBird, and a combination of these three datasets (Combined). Boxes show median marginal R² (A), conditional R² (B), root mean square error (RMSE; C), and residual standard deviation (Sigma; D) values for 18 study species ± the first and third quartiles, with vertical lines indicating the 95% confidence intervals, and circles indicating outlier values. Different lowercase letters indicate significant differences in fit metrics among datasets. In all panels, model fit metrics are positioned from left to right in order of worst to best fits. Y-axes were cropped to remove RMSE and Sigma outliers > 4.5 to better visualize differences among datasets.
Combining GLMMP, BBS, and eBird data tended to narrow 95% CIs around trend estimates, such that combined datasets estimated trends with high precision for 11/18 (61%) species, and only species with low trend precision in all three original datasets (Mute Swan, Black Tern, American Coot, Sandhill Crane, and Trumpeter Swan) had low precision for combined trends (Appendix 5). Furthermore, combined trend estimates were more similar to trends estimated from GLMMP data compared to the BBS and eBird, with trend estimates within 35% difference of GLMMP trends for eight species (versus five species for eBird and three species for BBS). Trend estimates based on GLMMP data tended to be more negative than those based on BBS and eBird (positive bias in trends for 12 species based on BBS and eBird data). As a result, combined trends were positively biased for 11 species (range in percent difference from the GLMMP = 8-427%; Appendix 5).

DISCUSSION
Our study compared the GLMMP, BBS, and eBird for marsh bird monitoring. We acknowledge the many differences among these three programs in field protocols and sampling frames, and in purpose and the values of the counts they produce. Nonetheless, we see merit in exploring and comparing the relative patterns in annual indices and trends as we have done through data filtering and suitable model terms to account as much as possible for these differences.
The GLMMP produced more precise estimates for rare and/or secretive marsh-dependent species and for species that are more restricted to marshes, whereas eBird (and at times the BBS) generated more precise estimates for species that have more varied habitat preferences. Indeed, the targeted GLMMP protocol produced more frequent detections, greater counts, and/or more precise trend estimates relative to the BBS and eBird for six elusive marsh-dependent species and two songbirds that breed exclusively in wetlands. Conversely, eBird and/or the BBS performed better than the GLMMP by having more frequent detections and/or greater counts for species that also nest and feed in non-marsh habitats, such as upland areas, rivers, small lakes, and swamps.
Results for eBird point to the potential value of this dataset for marsh bird monitoring. Even when filtered to detections near marshes, eBird data covered more sites located across larger areas than the GLMMP and BBS and generated larger amounts of data for each species. Compared to the GLMMP, eBird also had more frequent detections and higher trend precision for several species. Furthermore, positive correlations between the annual indices and trends from eBird and GLMMP indicated that both datasets tended to agree on whether a species' abundance was increasing or decreasing. The eBird dataset could, therefore, make a useful contribution to ongoing marsh bird monitoring efforts. However, further work should be completed to better understand instances where eBird data produced results that conflict with those of the GLMMP (e.g., negative correlations between GLMMP and eBird annual indices for Common Grackle, Red-winged Blackbird, Wilson's Snipe, and Sora).
Several lines of evidence show that roadside BBS surveys poorly monitor marsh birds in the type of landscape found in the Great Lakes. BBS data showed the lowest frequency of occurrence for all but two species, with 11/18 (61%) species detected in < 3% of surveys. Previous analyses have similarly reported that the BBS detected marsh birds at < 5% of survey routes per species per year on average across North America (Pardieck et al. 2020). With so few annual detections in our study area, generating enough statistical power to estimate linear population trends would require > 30 years of monitoring (Steidl et al. 2013). Although the BBS has > 50 years of data to identify long-term trends (Sauer et al. 2017), 30 years is too long to effectively identify short-term population declines and respond to conservation issues (Tozer 2016). Low detections and counts in the BBS dataset produced models with poor convergence for five species, and BBS annual indices and trends showed weak or negative correlations with indices and trends estimated using the other two datasets. BBS annual indices and trends were also very imprecise for several species. This is probably mostly because the spatial arrangement of wetlands and associated roads in our region means that wetlands are infrequently sampled by randomly placed roadside BBS survey stops, which in turn causes the very low frequency of marsh bird detections by the BBS and its associated poor model performance. By contrast, in some regions, such as the Canadian Prairies, the spatial arrangement of typically smaller and more numerous wetlands and roads might mean that BBS data perform relatively better for monitoring marsh birds compared to the Great Lakes region in our study. A comparison of how well BBS data monitor marsh birds in the Canadian Prairies relative to the Great Lakes is an area for future research. Therefore, although BBS data may be a useful addition to integrated analyses, our results suggest that BBS data alone are not effective for monitoring marsh bird populations in the type of landscape found in the Great Lakes region.

Fig. 3. Example plots of annual indices for (top to bottom) Swamp Sparrow (Melospiza georgiana), Virginia Rail (Rallus limicola), American Bittern (Botaurus lentiginosus), and Common Gallinule (Gallinula galeata), with 95% confidence intervals and smoothed trajectories (LOESS with a span of 1), as predicted by models using data from the Great Lakes Marsh Monitoring Program (GLMMP), North American Breeding Bird Survey (BBS), eBird, and a combination of these three datasets (Combined). Y-axes were cropped on plots with large confidence intervals to better visualize estimated trajectories. Plots for the remaining species are in Appendix 3.
Here we performed a preliminary assessment to explore whether integrating citizen science data from multiple sources could improve monitoring of elusive marsh birds. Combining data from the GLMMP, BBS, and eBird increased trend estimate precision for many species. However, contrary to our predictions, combined trends were positively biased relative to trends estimated from the GLMMP for 11 species. It is important to note that the five species with the strongest positive trend biases (> 95% error relative to the GLMMP) were species that also nest and feed in non-marsh habitats. These results may suggest that BBS and eBird provide more reliable count data for species that also nest and feed outside of marshes. Alternatively, our results may suggest that trends for these species differ between marshes and the broader landscape.
Our study thus lends support to a growing body of literature showing the benefits of data integration for various species-level analyses, including better understanding of population trends (e.g., Pacifici et al. 2017, Walker and Taylor 2017, Reich et al. 2018, Fletcher et al. 2019, Miller et al. 2019, Zipkin et al. 2019, Isaac et al. 2020, Hertzog et al. 2021). It also lends support to continued delivery of multiple citizen science monitoring programs as sources of information for more powerful integrated analyses.
Data integration did not improve estimates for species with the most infrequent occurrence among the three monitoring programs. Indeed, all trends were estimated with low precision for American Coot, Mute Swan, Sandhill Crane, and Trumpeter Swan, which occurred in < 3% of GLMMP, BBS, and eBird surveys on average. Our study area falls within the breeding range of these species; therefore, low counts do not result from these species being at the edge of their range in the study area. For these species, trend estimates were too variable to confidently identify a 50% change in abundance over 20 years (ECCC 2014). Such uncertainty is particularly concerning for species like American Coot, which have shown steep declines (-4.8 to -8.2%/year) in previous analyses of GLMMP data from the Great Lakes region (Tozer 2016, 2020). These results, as well as the relatively low frequencies of occurrence for most of our study species, highlight the need for improved monitoring efforts for certain marsh-dependent species to estimate trends in southern Ontario. It should be noted that in this paper we did not run predictive models aimed at thoroughly evaluating how several factors impact species abundance. Instead, our goal was to compare and contrast three citizen science bird-monitoring programs, alone and combined, using methods that were consistent across datasets. It is, therefore, not surprising that our models had relatively low marginal R² values, as they only included year and covariates to account for potential survey bias, rather than a suite of factors to explain each species' abundance. Previous work in the Great Lakes region has shown that numerous site, wetland, and landscape-scale factors differentially influence the occupancy dynamics of various marsh species (e.g., Saunders et al. 2019, Tozer 2016, Tozer et al. 2010, 2018). 
There are, therefore, factors that could be included to improve model fits and the confidence in abundance and trend estimates for each individual species. It should also be noted that we used a simple data-pooling approach to model combined data from the three monitoring programs, which might have failed to account for important differences among the datasets. There are, however, alternative model-based data integration approaches for utilizing information from multiple monitoring programs that could be pursued and that might further improve model performance (e.g., Massimino et al. 2008, Hertzog et al. 2021). This is a fruitful area for future marsh bird monitoring analysis, particularly now that we have shown that combining data from multiple programs is worthwhile for modeling annual indices and trends of marsh-dependent species.

CONCLUSION
Here we compared marsh bird monitoring data from the GLMMP, BBS, eBird, and a combination of these three datasets.
Our results highlight the value of the dedicated Marsh Monitoring Program protocol for tracking marsh-dependent species. The GLMMP had more frequent detections, greater counts, and/or more precise trends for species that breed almost exclusively in marshes (in particular, American Coot, Common Gallinule, Least Bittern, Marsh Wren, Pied-billed Grebe, Sora, Swamp Sparrow, Virginia Rail), whereas species with more variable habitat preferences had more frequent detections, greater counts, and/or more precise trend estimates based on eBird and/or BBS. Combining counts from the GLMMP, BBS, and eBird increased the precision of population trends for multiple marsh-dependent species. Our findings, therefore, provide evidence that integrating citizen science data from these three sources improves trend estimates, especially for the following marsh-dependent species: American Bittern, Common Gallinule, Least Bittern, Pied-billed Grebe, Sora, and Virginia Rail. We also conclude that useful information, not provided by the GLMMP alone, can be gathered about species with more variable habitat preferences through the BBS and eBird, which sample uplands as well as wetlands. We, therefore, recommend continued support for and delivery of the Marsh Monitoring Program in particular, as well as the BBS and eBird, in the region. Our findings are likely generally applicable to other regions where the Marsh Monitoring Program operates across southern Canada.