Species-environment associations and predicted distribution of Black Oystercatcher breeding pairs in Haida Gwaii , British Columbia , Canada

We present a species distribution model (SDM) for prediction of Black Oystercatcher (Haematopus bachmani) breeding pair occurrence in Haida Gwaii, British Columbia. Boosted regression trees, a machine learning algorithm, was used to fit the model. In total, 14 predictors were selected a priori through development of a conceptual model. Breeding pair occurrence data were compiled from two available surveys conducted in 2005 and 2010 (545 km of shoreline surveyed in total). All data were aggregated to common model units (vector polyline shoreline segments approximately 100 m in length), which approximate breeding territory size. The final model, which included eight predictors (distance to treeline, island area, wave exposure, shoreline type, intertidal area within 50 m, segment length, rat occurrence, and intertidal area within 1000 m), had excellent predictive ability assessed by 10-fold cross-validation (AUC = 0.89). Predictive ability was reduced when the model was trained and tested on spatially (AUC = 0.86) and temporally (AUC = 0.83) independent data. Distance to treeline and island area had greatest influence on the model (RI = 41.5% and RI = 36.7%, respectively); we hypothesized that these predictors are related to avoidance of predators. Partial dependence plots revealed that breeding pairs tended to occur: further from the treeline, on small islands, at high wave exposures, at moderate intertidal area, on bedrock or gravel shoreline types, and on islands without rats. However, breeding pairs tended not to occur on very small islands and at very high wave exposures, which we hypothesize to reflect avoidance of nest washout. Results may inform local conservation and management efforts, i.e., from predictive maps, and eventual development of a high-resolution (~100 m) model for prediction of Black Oystercatcher breeding pairs at a regional scale. Further, methods and GIS data sets developed may be used to model distribution of other coastal species in the region. Association espèce-environnement et répartition prédite de couples d'Huîtrier de Bachman nichant sur Haida Gwaii, Colombie-Britannique, Canada RÉSUMÉ. Nous présentons un modèle de répartition d'espèce pour prédire l'occurrence de couples nicheurs d'Huîtrier de Bachman (Haematopus bachmani) sur Haida Gwaii, en Colombie-Britannique. La technique de boosting d'arbre de régression, un algorithme d'apprentissage automatique, a été utilisée pour ajuster le modèle. Quatorze variables explicatives ont été sélectionnées a priori lors de l'élaboration du modèle conceptuel. Les données d'occurrence de couples nicheurs ont été compilées à partir de deux relevés effectués en 2005 et 2010 (545 km de rive inventoriés au total). Toutes les données ont été regroupées en une unité commune de modélisation (segment de rive en vecteur polyligne, de 100 m de longueur environ), correspondant approximativement à la taille du territoire de nidification. Le modèle final, qui comprenait huit variables explicatives (distance à la forêt, superficie de l'île, exposition aux vagues, type de rive, superficie intertidale à l'intérieur de 50 m, longueur du segment, occurrence de rats, superficie intertidale à l'intérieur de 1000 m), avait une excellente capacité de prédiction, évaluée par validation croisée répétée 10 fois (surface sous la courbe [AUC] = 0,89). La capacité de prédiction était plus faible lorsque le modèle était entraîné et testé avec des données indépendantes spatialement (AUC = 0,86) ou temporellement (AUC = 0,83). La distance à la forêt et la superficie de l'île ont eu l'effet le plus grand sur le modèle (influence relative [RI] = 41,5 % et RI = 36,7 %, respectivement); nous pensons que ces variables sont reliées à l'évitement des prédateurs. Les graphiques de dépendance partielle ont révélé que les couples nicheurs avaient tendance à se trouver plus près de la limite forestière, sur de petites îles, avec une exposition importante aux vagues, avec une superficie intertidale modérée, sur une rive rocheuse ou de gravier et sur des îles sans rats. Toutefois, les couples semblaient éviter les très petites îles et l'exposition très importante aux vagues, ce qui nous fait croire qu'ils évitent que leur nid ne soit emporté par les vagues. Nos résultats peuvent servir à orienter les activités locales de conservation et de gestion, à partir des cartes de prédiction, et à l'élaboration éventuelle d'un modèle à haute résolution (~100 m) pour prédire l'occurrence de couples nicheurs d'Huîtrier de Bachman à l'échelle régionale. De plus, les méthodes et les jeux de données SIG que nous avons élaborés peuvent être utilisés pour modéliser la répartition d'autres espèces côtières dans la région.


INTRODUCTION
The Black Oystercatcher (Haematopus bachmani) is a long-lived shorebird that breeds annually from May-August along the west coast of North America, from Baja California, Mexico, to the Aleutian Islands, Alaska, USA.Breeding pairs establish and aggressively defend territories, generally along rocky coastlines, and forage exclusively in the intertidal zone.Black Oystercatchers prey on a wide variety of invertebrate species, although limpets (Lottiidae spp.) and mussels (Mytilus spp.) make up the majority of their diet (Andres and Falxa 1995).
There is concern over the conservation status of the Black Oystercatcher (Tessler et al. 2014) because of relatively restricted habitat requirements (Tessler et al. 2014), small population size (8300-12,500; Andres et al. 2012), vulnerability to a number of coastal environmental threats, e.g., oil spill, introduced predators, human disturbance, and ocean acidification (Sloan and Bartier 2006), and expected contraction of breeding range due to climate change (31% by 2080; Langham et al. 2015).There is also interest in monitoring Black Oystercatcher breeding populations because of their role as indicators of overall intertidal ecosystem health (Sloan andBartier 2006, Carlson-Bremer et al. 2010) and recovery of the endangered Northern Abalone (Haliotis kamtschatkana; Bergman et al. 2013).
To predict vulnerability to climate change impacts, to identify important areas for conservation activities, i.e., monitoring or protection, and to improve understanding of the conservation status of the Black Oystercatcher, there is a need for improved breeding distribution models ("geospatial habitat models," "geospatial description of good-to-optimal breeding habitat") as a basis for prediction (Tessler et al. 2014, Weinstein et al. 2014).These models may be referred to broadly as species distribution models (SDMs).SDMs statistically or physiologically relate species occurrence data to environmental covariates (predictors) to predict species distributions (Guisan and Zimmermann 2000) and are increasingly being used as tools for conservation management in coastal environments (e.g., Lindegarth et al. 2014).To date, no SDM has been developed to predict distribution of Black Oystercatcher breeding pairs at a landscape scale.
Several studies have compared habitat features of Black Oystercatchers measured in situ at known breeding sites vs. random sites (e.g., Vermeer et al. 1992), or have investigated the relationship between reproductive success and habitat quality (e.g., Hazlitt and Butler 2001).However, lack of environmental data covering unsampled shorelines in these studies precludes prediction.McFarland (2010) compared remotely sensed habitat features at known breeding sites vs. random sites in Alaska, but did not generate predictions at unsurveyed shoreline.In California (Weinstein et al. 2014) and Alaska (A.Poe and B. A. Andres, 1999, unpublished manuscript), mean breeding density at different shoreline types was calculated to estimate population size and identify suitable habitat within a GIS.The British Columbia Marine Conservation Analysis program collated and mapped observed breeding pairs from various historical and current surveys at a regional scale, but these maps (available online) do not provide information at unsurveyed shorelines.Finally, probability of observation models developed by the Breeding Bird Atlas of British Columbia predict breeding Black Oystercatcher distribution from a set of distal predictors (latitude, longitude, aspect, elevation, slope) across B.C. at 10 km resolution, a scale generally not useful for landscape-level conservation management (Richmond et al. 2015).
Here, we present a SDM for predicting Black Oystercatcher breeding pair occurrence at a landscape scale (i.e., 10-200 km extent, 100 m resolution; Wisz et al. 2013) in southeastern Haida Gwaii, British Columbia, Canada.British Columbia is estimated to hold 10-20% of the global breeding population (Tessler et al. 2014), approximately one-third of which are thought to inhabit Haida Gwaii (Harfenist et al. 2002).The study area serves as a useful location to test various aspects of model and predictor performance because of availability of a rich set of occurrence and environmental data.Also, because the habitat is relatively undisturbed and in the core of the Black Oystercatcher range, the SDM should provide insights into its breeding habitat requirements throughout much of its range (within the limits of sampled environmental space).Prior to model fitting, we developed a conceptual model for breeding pair occurrence based on a review of relevant literature on breeding biology and habitat; feedback was received from several experts and local practitioners (Dalgarno 2016).This feedback informed our predictor selection, model interpretation, and improved our understanding of the model's limitations.In general we expect that important exogenous factors influencing breeding pair occurrence are either directly or indirectly related to (1) prey abundance and availability, and ( 2) reproductive failure and mortality of adults.Endogenous and observer processes influencing observed occurrence are discussed, although not modeled explicitly.
The main objectives of this study are the following: (1) to test the use of several GIS-derived and remotely sensed predictors to predict breeding pair occurrence at a landscape scale; (2) to identify important species-environment associations through interpretation of species-response curves and relative influence of individual predictors on model performance; (3) to evaluate performance (discrimination) and spatial and temporal transferability of the model; and (4) to predict distribution at unsurveyed shoreline in southeastern Haida Gwaii.

Study area
The distribution of Black Oystercatcher breeding pairs was predicted on 1423 km of shorelines in southeastern Haida Gwaii, an archipelago located approximately 100 km west of the British Columbia mainland (Fig. 1).Most shoreline lay within Gwaii Haanas National Park Reserve, National Marine Conservation Area Reserve, and Haida Heritage Site (Gwaii Haanas) or K'uuna Gwaay Conservancy and Haida Heritage Site (K'uuna Gwaay; also known as Laskeek Bay).The study area was confined to the Hecate Strait ecosection (Zacharias et al. 1998), as far north as Cumshewa Inlet.These boundaries were chosen to avoid extrapolation into unsampled geographic and environmental space: most notably, higher wave exposures and different oceanographic and climatic conditions west of the study area and increasing human disturbance north of the study area.The Hecate Strait ecosection is characterized by shallow to photic water depth (over 20% 0-20 m), high wave exposure, high subsurface relief, low current speeds, and a mixture of sand and hard bottom substrate (Zacharias et al. 1998).Within the study area, the shoreline is primarily bedrock (38%) and mixed bedrock/sediment (26%; Howes et al. 1994).Of 846 islands in the study area, 80% are less than 1 ha, 32% are less than 0.1 ha, and 28% are forested.Dominant wind directions recorded at Cumshewa Island, the nearest weather station, are southeasterly and, to a lesser extent, northwesterly.

Model unit creation
We developed a loosely coupled, vector-based (polyline) modeling approach.Methods used to create model units were informed by the need to integrate ShoreZone attributes and other GIS data sets.ShoreZone used 1:5000 aerial video captured in 1991-1992 to create 1:50,000 scale digital physical shoreline maps delineating discrete, systematically described "units" of homogenous shoreline type (34 coastal classes).Units were also assigned a biological wave exposure class (six ordinal classes), and a designation of "absent," "patchy," or "continuous" for "biobands" of highly visible species or species assemblages (for details on how qualitative ShoreZone classes were defined see Sloan and Bartier 2006).To maintain the boundaries of ShoreZone units whilst creating a model unit at higher and more consistent resolution (ShoreZone units range from 10 m to 13,000 m in length), we split ShoreZone units into "segments" with a length as close to 100 m as possible using an automated procedure in a GIS (e.g., a "unit" of 321 m was split into 3 segments of 107 m).The model unit size was chosen to approximate mean Black Oystercatcher breeding territory size (Nysewander 1977, Hazlitt 2001).All occurrence and environmental data, which were of various data types (Table 1), were aggregated to each model unit in a GIS.

Breeding pair occurrence
Occurrence data were compiled from two available surveys completed 22-29 June 2005 and 6-10 June 2010 in two separate regions, by Canadian Wildlife Service (CWS) and Laskeek Bay Conservation Society (LBCS), respectively (Fig. 1).This timing coincides with a period of relatively minimal permanent breeding pair ingress or egress as territories are usually established mid-May and fledging occurs July-August; despite difference in survey date, the use of both data sets was therefore justified (Vermeer et al. 1992, Hazlitt 1999, Hipfner et al. 2012).Of 1423 km of shoreline within the study area, LBCS and CWS surveys covered approximately 164 km (12%) and 381 km (27%) in the northern (Juan Perez sound and Laskeek Bay) and southern (south of Juan Perez) parts of the study area, respectively.Note that survey routes covered both poor and good quality habitat and were completed prior to commencement of rat eradication efforts in Gwaii Haanas.
In both CWS and LBCS surveys, at least two observers scanned the shoreline by boat for territorial Black Oystercatcher breeding pairs.Boats remained approximately 50 m from land and travelled at a speed of less than 11 km/hr.Surveys were not conducted in rainy (except in light rain) or windy/choppy (> 15 knots) conditions to improve detectability.Birds were deemed territorial if they were observed to exhibit two or more key behavioral criteria: sentry behavior, e.g., one or both birds sitting in a location within the territory where they have a good view of surroundings; alarm behavior, e.g., crouching, scurrying, to draw attention away from nest; group size, e.g., commonly as a pair, particularly when alarmed; location, e.g., clearly associated with potential breeding habitat.Upon finding a territorial pair, shoreline was surveyed by foot to search for evidence of a scrape, chicks, or eggs.Any areas previously occupied by a breeding pair were also searched on foot.In CWS surveys, more poor quality habitat was surveyed, e.g., sand beaches, although these sections were scanned from a Table 1.Predictors hypothesized to influence Black Oystercatcher (Haematopus bachmani) breeding pair occurrence at shoreline segments, selected a priori from a conceptual model and constrained by data availability.Scale of Aggregation: original resolution and geometry of the data before aggregated to shoreline segments; Retained: Indicates whether predictor was retained following pairwise correlation analyses (if rho > 0.7; the least ecologically relevant predictor was removed).During surveys, GPS coordinates were taken at the location of a territorial pair or nest-site.GPS coordinates that were not available for 11 of 91 sites in CWS surveys were determined post hoc from satellite imagery and field notes.Coordinates were imported into a GIS and assigned to the nearest shoreline segment (see below) using a tool within ArcGIS 10.2.Any shoreline segment with at least one territorial breeding pair was deemed present.Any shoreline segment that was surveyed but did not have a territorial breeding pair present was deemed absent.In 12 cases, breeding pairs were "moved" to an incorrect segment.This occurred when ShoreZone shoreline resolution was too low to accurately represent small islets.The actual nest site location, i.e., small islet, was visible from high-resolution satellite imagery, but was absent in the ShoreZone data.These cases were removed prior to analysis.

Environmental data
Fourteen potential predictors were selected a priori (Table 1), following development of a conceptual model (Dalgarno 2016).Predictors were (1) linked as directly as possible to some known or hypothesized biological mechanism; (2) appropriate to the scale of study; and (3) available for the study area at sufficient quality, coverage, and resolution.
A quantitative wave exposure index (Fetch) was available from the haidawave R package, which was codeveloped by the lead author of this study (Dalgarno and Thorley 2017).This index, which uses algorithms developed by Murtojärvi et al. (2007), is the sum of fetch calculated at 10 radiating lines for each point spaced 10 m along the shoreline.

Statistical analyses
All statistical analyses were conducted in R version 3.4.1 (R Development Core Team 2017).Results may be reproduced from publicly available code archived within a Github repository (Dalgarno 2017).We fit models using boosted regression trees (BRTs), a machine-learning ensemble method that creates and averages many classification trees (Breiman 2001)  Model fitting and prediction was performed using the gbm package (Ridgeway 2006) and with several functions in the dismo package (Hijmans et al. 2012).Three model parameters are user-defined: learning rate specifies the rate at which the contribution of each subsequent tree to the overall model shrinks; tree complexity specifies the number of nodes per tree, which in effect controls the maximum level of interaction between predictors; and bag fraction determines the proportion of observed data randomly selected for each tree.Altogether, these parameters determine the number of trees used to fit the model and can be adjusted to optimize model performance or, if necessary, limit computation time.Parameters were generally set according to recommendations in Elith et al. (2008).For all models, we used a bag fraction of 0.5 and maximum tree complexity of 3. Because any bag fraction less than one introduces model stochasticity (De'ath 2007), all model summary statistics, e.g., evaluation or interpretation, were calculated from the mean of 10 model iterations.Learning rate was adjusted until the optimal number of trees exceeded 1000, as calculated through a 10-fold cross validation procedure implemented with the gbm.step function in the dismo package.
Prevalence of breeding pairs was low, which led to an unequal balance of presence to absence segments (166:5220; 3%).Low prevalence (especially < 10%) reduces BRT model performance unless corrected (Barbet-Massin et al. 2012).We down-weighted absence data according to prevalence in the training set so that presence and absence data had equal weight in model fitting.
Species-response curves and relative influence scores were used to interpret the association between individual predictors and occurrence.Response curves were determined using partial dependence plots, which plot the effect of different predictor values on the response, while all other predictors are held at their mean (Elith et al. 2008).Relative influence (RI) of each predictor was determined following a method outlined in Friedman (2001).
Each time a predictor is selected for splitting, the squared improvement on the model is summed, and averaged across all trees.This value is then normalized so that all predictor RI scores sum to 100, with higher numbers indicating higher influence (Elith et al. 2008).Because model unit size was variable, segment length (SegLength) was included as a model predictor to test its influence.
Several procedures were implemented to reduce risk of model overfitting.First, the aforementioned 10-fold cross validation procedure in the gbm.step function determined the optimal number of trees to use before predictive performance on out-ofbag data, i.e., 10% of data withheld for testing, declined.Second, if any predictors had pairwise Spearman's correlation coefficient (rho) greater than 0.7, the predictor deemed least ecologically relevant or accurate was removed (Dormann et al. 2013; Table 2).This resulted in the removal of three predictors (BioExp, IT200, IT500); all retained predictors had rho less than 0.6.Finally, noninformative predictors were removed using the gbm.simplify function, within the dismo R package.In this procedure, the model is progressively simplified by dropping the least important predictor and refitting models until predictive performance (average CV error from 10-fold CV) is less than the original model (see Elith et al. 2008 for details).This procedure was run for each of 10 model iterations and the final predictor set excluded any predictors that were removed in five or more iterations.We assessed the ability of models to discriminate between observed presence and absence using Area Under the receiver operator characteristic Curve (AUC: Fielding and Bell 1997).AUC does not require selection of a threshold to reclassify continuous predicted values into binary presence/absence.In models with good discrimination, higher predicted values will tend to be associated with observed presence and lower predicted values will tend to be associated with observed absence, with minimal overlap (Lawson et al. 2014).AUC value of 1 indicates perfect discrimination and AUC value of 0.5 indicates that the model performed no better than random.AUC of 0.9 indicates excellent, 0.8-0.9very good, 0.7-0.8satisfactory, and < 0.7 indicates poor discrimination ability (Hosmer and Lemeshow 2000).All reported AUC values were derived from 10-fold crossvalidation.
Evaluation of a model's ability to predict independent data, i.e., data not used for model training, is generally recommended when sufficient occurrence data are available (Araújo and Guisan 2006).
We used three methods to assess predictive ability (indicated by AUC): (1) we trained the model on all data and tested predictive ability using 10-fold cross-validation; ( 2) we trained the model on 2005 occurrence data (~75% of segments) and tested predictive ability on 2010 occurrence data; and (3) we trained the model on the northernmost 75% of presence and northernmost 75% of absence segments and tested predictive ability on the remaining 25% of segments in each category.Finally, spatial autocorrelation of model residuals was tested using Moran's I, calculated with the lctools R package (Kalogirou 2017).

RESULTS
A  3 and Table 4).

Predictive maps
In Figure 4, we present a map of predicted probability of occurrence, with values generated from the final model (Panel A).
As an example, we also show predicted probability of occurrence on Bischof Islands (Fig. 4  white dots.From these enlarged areas, it is clear that the model correctly predicted low density of breeding pairs on Murchison and Faraday mainland and high density on Bischoff's Islands.As well, the model generally correctly predicted high occurrence on small, rocky offshore islands with higher wave exposure.

Interpreting predictions
The main output of this study was a map of predicted probability of occurrence of Black Oystercatcher breeding pairs at ~100 m shoreline segments in southeastern Haida Gwaii.Segments approximate mean territory size.Probability of occurrence, which is on a continuous scale (bounded by 0 and 1), can be converted to predicted presence/absence using a threshold, although this may lead to an unnecessary loss of information (Guillera-Arroita et al. 2015).If predictions of presence/absence are required, choice of threshold should depend on the application (Fielding andBell 1997, Lawson et al. 2014) because different thresholds will maximize different aspects of model performance (Liu et al. 2005).

Species-environment associations
In this study we aimed to test the ability of several GIS data sets, hypothesized to be proxies for some underlying biological mechanisms, to predict breeding pair occurrence.Although causation cannot be determined from correlative models, speciesresponse curves may inform biological hypotheses that can be tested through experimentation (Dormann et al. 2012).Speciesresponse curves identified here closely matched what was expected based on literature reviewed.
In general, predictors hypothesized to be associated with nest failure/mortality (IslandArea, TreeDist, RatStatus), not prey abundance/availability (Fetch, IT50, IT1000, ShoreType), had higher influence on the models.Tendency for breeding pairs to occur at small islands, far from treeline has been observed previously in Haida Gwaii (Vermeer et al. 1992) and elsewhere (Hazlitt 2001, McFarland 2010).We suggest that these predictors act as proxies for occurrence of a wide variety of predators known to cause nest failure or mortality, including Peregrine Falcon (Falco peregrinus; Hipfner et al. 2012); Gulls (Larus glaucescens; Hartwick 1973, Nysewander 1977), Crows (Corvus caurinus; Hartwick 1973, Nysewander 1977, Hipfner et al. 2012), black bears (Ursus americanus), Common Ravens (Corvus corax), wolverines (Gulo gulo), and Bald Eagles (Haliaeetus leucocephalus; Morse et al. 2006).We expect that breeding pairs avoid nesting in areas that increase their vulnerability to nest failure through the process of natural selection.They are also able to adapt habitat use within several years of environmental change, as revealed by experimental removal of predators and human disturbance from islands (Ainley and Lewis 1974, Nysewander 1977, Byrd et al. 1997, Croll et al. 2015).
Nest failure may also be attributed to washout, i.e., from waves (Vermeer et al. 1992), extreme high tides (Morse et al. 2006), and severe storms (Hartwick 1973, Nysewander 1977).We found some evidence to support the hypothesis that these factors may be driving breeding pair occurrence.First, breeding pairs avoided very small islands (< 1500 m²), likely because these were inundated at high tide, or were more vulnerable to inundation from extreme tides or waves.However, it should be noted that very small islets were not represented by the shoreline data set and any breeding pairs that occurred on these were removed from the occurrence data set.Thus, the apparent avoidance of very small islets should be further examined.Second, breeding pairs tended to occur at moderate wave exposures (Fetch), with a sharp decline in predicted probability of occurrence at extreme exposures.Response to wave exposure appears to reflect a trade-off between prey, which is more abundant (Dalgarno 2016) and available, i.e., from gaping mussels (Falxa 1992) at high exposures, and vulnerability to nest washout or risk of injury at the most extreme exposures.For example, Falxa (1992) observed foraging adults avoiding sites at the highest exposures, which he attributed to avoidance of injury by pounding waves (Falxa 1992).To our knowledge, no previous studies have previously identified these nonlinear relationships, i.e., tendency to avoid very small islands and very high exposures.
Breeding pairs generally preferred rock or gravel shoreline types that were wide, and generally avoided sand, mixed sand/gravel, and narrow shoreline types.A tendency for breeding pairs to occur on rocky or gravel shoreline types has been observed throughout its range (Andres 1998, Poe et al. 2009, Weinstein et al. 2014).However, avoidance of wide rock with gravel and narrow rock with sand and gravel shoreline types observed in this study did not follow this general pattern.Prey abundance is likely to be higher at rocky shore types and at low exposure gravel sites (Dalgarno 2016), which we hypothesize to be the reason for preference of these shoreline types.Wide shoreline types are also likely to provide more habitat for prey and facilitate foraging, because Black Oystercatchers cannot forage on vertical surfaces (Lindberg et al. 1998), and breeding pairs are able to deliver more prey to chicks (Hazlitt et al. 2002).
Similarly, we hypothesize that increasing probability of occurrence with increasing IT50 values (intertidal area within 50 m circular buffer) may be explained by prey abundance and prey availability (Lindberg et al. 1998, Hazlitt et al. 2002).The observed decline in probability of occurrence at extreme values of IT50, i.e., very flat areas with > 75% intertidal area, may be attributed to correlation with shoreline type, i.e., estuary, lagoon, or wide sandy beach.
Better performance of the IT50 predictor over IT1000 (intertidal area within 1000 m) provides some evidence that foraging occurs primarily within close proximity to nest-sites, i.e., within a breeding territory, despite observations of foraging adults travelling up to a kilometer from breeding territories (Hartwick 1978).
There is particular interest in the effect of introduced rats on breeding Black Oystercatchers in Haida Gwaii, which has seen eradication of rats from Langara Island (Taylor et al. 2000), and more recently, several islands within Gwaii Haanas (Night Birds Returning Project, unpublished data).Although it is generally assumed that rats depredate Black Oystercatcher chicks and eggs (Kurle et al. 2008, Gruman 2013), the effect of rats on Black Oystercatcher breeding pair occurrence is not well known.Eradication of rats on Hawadax Island, Alaska, led to an increase in breeding population from one pair to six pairs over five years (Croll et al. 2015).Within our study area, Gruman (2013) demonstrated that breeding pair density is lower on islands with rats, although that study did not take into consideration other habitat qualities.
In this study, the influence of rat occurrence (RatStatus) was relatively low.There may be several reasons for the apparent lack of influence of rat occurrence on the model: at the scale of this study, TreeDist may act as a general proxy for occurrence of all predators, including rats; Black Oystercatchers may still occur on rat-infested islands, i.e., out of necessity or ignorance, but because our model does not account for reproductive success, we cannot observe and model the detrimental effect of rats, i.e., mismatch between reproductive success and habitat selection (Arlt and Pärt 2007).It is also possible that the quality of our rat occurrence data set was poor.Further studies may look at change in breeding population following rat eradication in our study area, although these data are not yet published.

Observation bias and territoriality
Models that explicitly account for the observation process are generally called occupancy models.In this study, we could not assume perfect detection probability (Lyons et al. 2012), nor that detection probability was not correlated with a predictor of occurrence.Thus, model predictions may be biased (MacKenzie et al. 2002, Guillera-Arroita et al. 2015).In repeated boat-based count surveys of Black Oystercatcher breeding pairs in Washington, detection probability was 0.75 (95% BCI: 0.42-0.91;Lyons et al. 2012).Thus, two or more repeated surveys on each shoreline segment would increase detection probability.However, repeating surveys is costly given minimal resources typically available for monitoring activities, and thus the required data for occupancy modeling is rarely available.
In this study, only 20% of shoreline was surveyed twice because of logistical constraints.However, even if all shoreline had been surveyed twice, the length of time between each survey (about a month) may have violated assumptions of occupancy modeling: that repeated surveys occur within a closed population (MacKenzie et al. 2002).Thus, although we acknowledge potential bias in our results given imperfect detection, we also suggest that research be undertaken to weigh the costs and benefits of collecting occurrence data suitable for occupancy modeling of Black Oystercatcher breeding pairs.
Another source of error may have been failure to adequately account for territoriality, which may explain moderate spatial autocorrelation in our model.Breeding pairs aggressively defend territories (Hazlitt 2001) and the study scale was chosen to approximately reflect mean territory size (Nysewander 1977, Hazlitt 2001).However, variability in territory size is large (0.79-1.84 ha ;Nysewander 1977;25-245 m length;Hazlitt 2001).Thus, on shorelines where model unit size is smaller than territory size, segments surrounding an observed breeding pair may have high predicted probability of occurrence, i.e., suitable habitat, but observed absence.An alternative approach to take in future studies may be to model breeding density in relation to features of islands (e.g., Heinänen and Von Numers 2009).As well, a sensitivity analysis could be conducted to determine the effect that changing model unit size has on model performance.However, it should be noted that modeling at the island scale or using larger average model unit size (e.g., ~200m) would lead to reduced accuracy of predictors.
Model unit size (SegLength) was retained in the final model, with probability of occurrence increasing on larger segments.However, we suggest that alternative modeling approaches would lead to reduced predictor accuracy (e.g., highest proportion of shore type rather than exact shore type) and that given its relatively low influence on the model (3%), this trade-off may not be justified.

Toward a regional model
We However, using methods developed here, and given good quality satellite imagery, shoreline and intertidal area data, predictors could be generated for other regions or broader extents at relatively low cost.We do not recommend collecting data on rat occurrence for the purpose of a Black Oystercatcher model because it is expensive to collect and influence on the model was low.
A regional model will require compilation of occurrence data from various surveys that have been conducted (historically or recently), typically from monitoring activities.There are several challenges associated with this requirement.First, it will be necessary to ensure that occurrence data adequately cover geographic and environmental space of prediction area.Although we show that model predictive ability on independent data was reasonably good (AUC > 0.85), model transferability may be an issue, especially in regions that have different environmental conditions.Moreover, our study demonstrated that predictive ability on temporally split data, i.e., trained on 2005 occurrence data, tested on 2010 occurrence data, was relatively poor.We suspect that this may have been caused by differences in survey methods/detection probability between surveys, or less likely, differences in habitat use between years.Temporal partition of data was also spatially partitioned (2005 occurrence data was collected in southernmost 75% of study area), so it was difficult to determine whether poor transferability could be attributed to different habitat use in the southern part of the study area, which is dominated by the presence of Moresby Island, and is generally less accessible.In general, we demonstrate that transferability may be an issue when occurrence data sets are compiled from multiple years/organizations.
Second, compiling occurrence data from historical monitoring activities may be challenging.In this study, we compiled survey data collected by two organizations: CWS and LBCS.For both data sets, nest locations in the form of GPS coordinates were readily available (as well as some absence data in the form of GPS coordinates from CWS).However, the survey route, which was needed to determine absence, was not immediately available and was determined by browsing field notes, reports, and through direct communication with the organizations.The use of historical survey data sets may be difficult if data or metadata are lost, especially in the event that data are used for purposes that they were not originally intended for (as in this study).Our study generally supports calls to digitize and make freely available historical and current ecological data sets, which are collected carefully and often at great cost (Hampton et al. 2012).
Finally, model predictors will have to be selected based on the study scale or region of interest.For example, water temperature and salinity may have greater influence on intertidal species distributions at broader scales (Raffaelli andHawkins 1996, Nyström Sandman et al. 2013).As well, tendency of breeding pairs to occur at different shoreline types, e.g., rocky vs. gravel, may vary geographically (Nysewander 1977, Byrd et al. 1997, Andres 1998, Weinstein et al. 2014) and different predators may have more or less influence on occurrence dependent on the region, e.g.foxes (Byrd et al. 1997), rats (Croll et al. 2015), raccoons (Vermeer et al. 1992), etc.Finally, although our study, Morse et al. (2006), andPoe et al. (2009) found no influence of low-level human disturbance on occurrence, human disturbance may be an important predictor of occurrence in regions outside of protected areas or on islands permanently inhabited by humans (Ainley and Lewis 1974, Vermeer et al. 1989, Andres and Christensen 2009).

CONCLUSION
The conservation status and landscape-level distribution of the Black Oystercatcher is largely unknown because of sparse survey data.The need to develop breeding pair distribution models has been identified as a key research priority by leading experts (Tessler et al. 2014).To our knowledge, this study is the first to develop a SDM for predicting probability of occurrence of breeding pairs at unsurveyed shoreline, at a scale relevant to landscape-level management.This modeling approach may be applied to other regions where adequate environmental and occurrence data exist.Further, our coastal SDM framework, which integrates ShoreZone attributes and novel GIS data sets, may be generally useful for predicting distribution of other coastal species in Haida Gwaii and B.C. more broadly.

Fig. 1 .
Fig. 1.Study area.Shoreline surveyed in 2010 is shown in red (Laskeek Bay and Juan Perez); shoreline surveyed in 2005 is shown in blue (Gwaii Haanas, south of Juan Perez); and unsurveyed shoreline over which breeding pair occurrence was predicted is shown in black.Inset shows study area in relation to Haida Gwaii and British Columbia mainland.

Fig. 2 .
Fig. 2. Partial dependence plots for continuous predictors retained in final model.Partial dependence plots show the effect of a given predictor on the response, with all other predictors held at their mean.Rug plots at the top show the distribution of surveyed segments across that predictor.

Fig. 3 .
Fig. 3. Partial dependence plots for categorical predictors retained in final model.Partial dependence plots show the effect of a given predictor on the response, with all other predictors held at their mean.

Fig. 4 .
Fig. 4. Predicted probability of occurrence of breeding pairs in southeastern Haida Gwaii.Insets show locations of observed breeding pairs (white points) relative to predicted probability of occurrence at sites of interest (Bischof Islands, Panel B; Murchison and Faraday Islands, Panel C; Swan and Bolkus Islands, Panel D).

Table 2 .
Pairwise Spearman correlation coefficients (rho) for ordinal and continuous predictors.Pairwise predictors with rho > 0.7 are shown in bold.

Table 3 .
Range and median for continuous predictors retained in final models over different extents of study area.Values are scaled for readability.Surveyed 2005: shoreline surveyed in 2005 (south of Juan Perez) by Canadian Wildlife Service; All Surveyed: all surveyed shoreline including 2005 and 2010 survey conducted by Laskeek Bay Conservation Society in Juan Perez and Laskeek Bay; Study Area: includes all surveyed shoreline and unsurveyed shoreline within the study area (southeastern Haida Gwaii).

Table 4 .
Percent shoreline for each shoreline type (ShoreType) over different extents of study area.Surveyed 2005: shoreline surveyed in 2005 by Canadian Wildlife Service (south of Juan Perez); Surveyed 2010: shoreline surveyed in 2010 by Laskeek Bay Conservation Society (Juan Perez and Laskeek Bay regions); All surveyed: all surveyed shoreline combined; Study Area: all shoreline in study area including surveyed and unsurveyed, i.e., predicted, shoreline.

Table 5 .
AUC of final model over different evaluation methods.Values are the mean over 10 model iterations.

Table 6 .
Relative Influence (RI) of predictors in final model.Values are the mean over 10 model iterations.
developed this model with the intention of providing a modeling framework for predicting Black Oystercatcher breeding habitat in Haida Gwaii, B.C., Canada, and to test a set of predictors that may be used in other regions, or at broader extents.Of the predictors retained in the final model, only ShoreType is currently available with B.C.-wide coverage.The Fetch predictor (and additional fetch-based wave exposure indices) is available for Haida Gwaii, and all remaining predictors, which were generated for use in this study, are only currently available for the study area.