Establishing the adequacy of recorded acoustic surveys of forest bird assemblages

The use of programmable acoustic recorders to survey forest birds is increasing owing to a range of advantages over surveys conducted by human observers. Users of these devices require a methodological framework for designing and testing a proposed survey protocol in context, to be assured that it has the capacity to efficiently meet the requirements of their study. We aimed to demonstrate how a potential acoustic survey protocol may be effectively tested by comparison with either (1) an observer-based method using species detection probabilities, or (2) the survey completeness levels among a set of other candidate acoustic protocols. Surveys using acoustic recordings (manually processed) and standardized area searches were conducted over the same period in dry sclerophyll forests of southeastern Australia. A multispecies occupancy modeling framework was used to obtain estimates of the probability of detecting individual species for both standardized searches and an acoustic protocol representing similar temporal sampling effort. Detection probabilities for 73% of species recorded using these methods were greater with the acoustic protocol than standardized searches, which established its adequacy for particular research questions. The survey methods resulted in a similar pattern of detection probabilities for foraging stratum guilds, although members of the canopy/subcanopy guild were less likely to be detected using both methods. Survey completeness (species detected/total species) was adopted as an alternative framework for acoustic protocol evaluation. The complete acoustic data set was (1) used with the incidence-based coverage estimator (ICE) to obtain the total number of species, and (2) subsampled to produce a candidate set of potentially useful survey protocols. Completeness levels ranged from 62% to 73% in the set, which provided options for subsequent protocol selection. 
Other ecologists may adopt one of the frameworks to establish the adequacy of their own acoustic survey protocol to suit their research question and available resources.


INTRODUCTION
Birds can be considered indicators of ecosystem integrity (Wimmer et al. 2010), so understanding their fine-scale distributions increases knowledge about species habitat requirements, capacity to tolerate disturbance, and wider community resilience. Reliable and efficient methods to measure bird diversity are necessary to provide understanding of the effects of changing global climate and increasing anthropogenic pressures on forest ecosystems. Cost-effective methods are required because governments worldwide have reduced their financial support for the management of protected areas, which makes monitoring and maintaining biodiversity more difficult to achieve (Watson et al. 2014). Incorporation of remotely deployed sensors into methods to capture species data in large natural areas is increasing as technology develops and becomes more available (Shonfield and Bayne 2017), while the cost of replicated surveys conducted by an observer in such areas remains high (Darras et al. 2019). In general, birds are aurally conspicuous, which renders them excellent subjects for the development of acoustic methods using devices left unattended to record in the field (Shonfield and Bayne 2017). However, to expand the capacity of acoustic recorders to survey forest bird assemblages, suitable frameworks are required for the systematic development and testing of efficient methods that incorporate the high accuracy of manual processing of recorded sound files.
There are several advantages to using acoustic recording methods to survey birds. A minimum of two short visits to a site is required to deploy and later collect a recorder, so there is scope in study designs to reduce the number of trips to sites and time spent in the field overall. For large studies (e.g., Furnas and Callas 2015), a set of recorders may be progressively relocated until many sites have been surveyed. Field personnel responsible for deployment of acoustic recorders do not require specialized ornithological skills (Hobson et al. 2002), which provides more staffing options for field data collection. When combined with potential for saving time in the field, the capacity to program an acoustic recorder to capture an extended temporal sample means that recorders can be efficiently deployed in remote areas, where access may not be straightforward (Shonfield and Bayne 2017). Furthermore, a researcher may incorporate the flexibility of being able to draw post-hoc samples from long duration recordings into a study design. It is possible to derive several different avian response variables from acoustic recordings (Darras et al. 2019). In addition to species richness (e.g., Wimmer et al. 2013), abundance or density can be estimated using data collected with a single recorder (e.g., Van Wilgenburg et al. 2017, Sebastián-González et al. 2018, Bombaci and Pejchar 2019). Once deployed, an acoustic recorder can collect large amounts of data over time while unattended at a location, but the resulting sound recordings need to be processed to extract the species data. Although automated processing methods are available, manual processing of sound files is currently the most reliable and accurate option when the aim is to detect all species present in recordings (Wimmer et al. 2013, Shonfield and Bayne 2017, Venier et al. 2017).
Manual processing involves systematically opening sound files in software that enables the analyst to listen to bird calls whilst viewing them represented in a spectrogram of frequency over time. Species detected in this way can be recorded for each sampling unit. Accounting for all recorded species while manually processing recordings can be time consuming because, for example, short sections of complex sound files of the dawn chorus or diverse sites may need to be replayed, perhaps alternately with reference calls. However, the time is spent conveniently at the desktop and the permanent record of a survey in the form of a sound file provides a means to obtain high accuracy in the detection of species (Shonfield and Bayne 2017). Reductions in processing time can be achieved if the analyst is able to make instantaneous visual appraisals of vocalizations on simpler spectrograms, without complementary audio (Truskinger et al. 2013).
Manual processing of large quantities of recorded acoustic data can yield comprehensive assessments of bird assemblages, but the level of resources required typically renders this approach unfeasible or too costly (Balestrieri et al. 2017). Studies seeking to use manually processed recordings to survey forest bird assemblages require an efficient and proven method that is tailored to the requirements of the study. The development of such a method would ideally take place within a robust framework that is flexible enough to accommodate a wide range of forest types and research aims. In this paper, we demonstrate and evaluate two frameworks that other researchers may consider for use as alternative approaches to producing their own acoustic survey method to efficiently collect and process field data, or retrospectively subsample large amounts of recorded acoustic data.
Given records of species detected in acoustic recordings obtained in a set of sites, with replicate sampling for each site, hierarchical multispecies occupancy modeling can be used to simultaneously estimate probabilities of species occurrence and detection (Darras et al. 2019). This approach attempts to account for imperfect detection, i.e., the chance that a species occupied a site during a survey visit, but was not detected. Probability of detection can vary widely among the bird species that make up an assemblage and can be influenced by numerous factors including bird appearance and behavior, site attributes, and the survey method used (Iknayan et al. 2014). Validation of prospective survey protocols using acoustic recordings can be achieved by running them in parallel with an established method conducted by an observer and comparing the results (Wimmer et al. 2013, Darras et al. 2018a). We used occupancy modeling to estimate and compare the probability of detection of individual species and guilds between a potentially useful acoustic protocol and an observer-based method (e.g., Furnas and McGrann 2018).
An alternative framework for developing an adequate acoustic method involves assessment of the survey completeness levels of a set of candidate acoustic protocols. Survey completeness can be expressed as the number of species detected, divided by the estimated total number of species present (Watson 2017). The completeness levels of the set of candidate protocols may, for example, lie in the range of 70-90%, with a final method selected to suit the aims of a study (Watson 2010, Callaghan et al. 2017). High-level sampling completeness may be required for detailed studies of species ecology, perhaps moderate completeness when species numbers are required, and modest completeness can be adequate when assemblages are to be compared in relative terms (Watson 2010). To apply this approach, a set of candidate protocols can be devised that reflect different temporal arrangements of samples and levels of total sampling effort. The number of species detected in a site using each of these protocols can then be represented as a percentage of the total number of species (e.g., Wimmer et al. 2013).

Fig. 1. The study was conducted in natural landscapes of the central Blue Mountains. Acoustic recording and area search methods were carried out in 10 dry sclerophyll forest (DSF) sites to survey bird assemblages. In each site, an acoustic recorder was randomly located in the forest patch core area, which excluded a 100 m wide buffer zone internal to the site boundary.
Forests worldwide are exposed to changing climate, increases in direct human disturbances, and flow-on effects, such as altered fire regimes (e.g., Bradstock et al. 2014). Some of these forested areas are large, remote, and difficult to access. For these reasons, they are often important reservoirs of global biological diversity. Focused acoustic recording methods are highly suited to survey bird assemblages in these areas. The present study was conducted in an area subject to these threats and with these attributes. The Greater Blue Mountains World Heritage Area is a one million hectare, mainly forested, contiguous protected area in southeastern Australia. The overarching aim of the study was to demonstrate and assess two frameworks that other ecologists could use to establish the adequacy of their own recorded acoustic survey protocol, using manually processed data. Specifically, we aimed to demonstrate how a potential acoustic protocol may be effectively tested by comparison with either (1) an observer-based method using species detection probabilities, or (2) the survey completeness levels among a set of other candidate acoustic protocols.

Study sites
Ten replicate dry sclerophyll forest sites were established on the tops and upper slopes of sandstone ridges in the central Blue Mountains, southeastern Australia (Fig. 1). The landscape of the study area consists of a regular pattern of ridges interspersed with gullies at elevations grading from 500 m in the east to 850 m in the west. All sites had a shrubby understory and were last burnt by a wildfire in early 2002 (Office of Environment and Heritage NSW 2016). Commonly occurring tree species included scribbly gum (Eucalyptus sclerophylla), Sydney peppermint (E. piperita), red bloodwood (Corymbia gummifera), and Sydney red gum (Angophora costata), which formed a canopy of 15-25 m in height with 30-70% foliage cover. Shrubs included species of Banksia, Leptospermum, Persoonia, Isopogon, Petrophile, Epacris, and Lambertia. Area searches and acoustic recording were conducted during mid-2017.

Bird surveys

Standardized area search
For the study, records of species present in sampling periods were obtained using both an observer-based method and acoustic recording. For the former, we used the standardized search (Watson 2003), which is built upon a 2-ha/20-min area search method that has been commonly used in Australian forests (Loyn 1986, Watson 2004). The 2-ha/20-min area search is also the most valuable survey method in Birdlife Australia's Atlas of Australian Birds and Birdata projects (https://birdata.birdlife.org.au/surveytechniques). In each site, individual birds seen, heard, or both seen and heard were recorded in timed 20-min periods, while the observer actively searched a 2-ha area within the site (Loyn 1986). The 2-ha areas were typically 100 x 200 m rectangles, but the shapes were allowed to vary to remain within the dry forest sites. All searches were conducted by the same observer (MF), who was experienced with the visual and aural detection of species occurring in the region.
Successive 20-min/2-ha searches were conducted for three hours following dawn for as many days as were required to satisfy a results-based stopping rule that was applied to survey forest sites equivalently (Watson 2003). The stopping rule was to cease sampling in a site when three consecutive 20-min searches yielded no new species (Watson 2004). The number of 20-min samples required before the stopping rule was triggered ranged from 7 to 15 among sites, resulting in an average of 9.8 periods/site. The starting point for the first search of each day was randomly selected, with subsequent 2-ha searches continuing directly on into new parts of the site, with the aim of actively searching throughout the site (Watson 2003). Sites were large enough to accommodate this approach, with the smallest being 14 ha. All samples used in this study were taken on days of no rain and no more than very light wind.
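The results-based stopping rule described above can be sketched in a few lines. This is a minimal illustration, assuming each 20-min search is represented as a set of species codes; the function name `searches_until_stop` and the example data are hypothetical and are not taken from the study.

```python
def searches_until_stop(searches, run_length=3):
    """Return how many searches are conducted before the stopping rule triggers.

    `searches` is an ordered list of sets, one per 20-min search, each holding
    the species detected in that search. Sampling stops once `run_length`
    consecutive searches add no new species. Returns None if the rule is
    never triggered within the supplied searches.
    """
    seen = set()
    run = 0  # consecutive searches that yielded no new species
    for count, species in enumerate(searches, start=1):
        new = set(species) - seen
        seen |= new
        run = 0 if new else run + 1
        if run == run_length:
            return count
    return None


# Hypothetical site; the single-letter species codes are illustrative only.
example_searches = [
    {"A", "B"}, {"B", "C"}, {"A"}, {"D"},
    {"A", "D"}, {"B"}, {"C", "D"},
]
print(searches_until_stop(example_searches))  # 7
```

In this example the last new species appears in the fourth search, so the rule is satisfied after the seventh.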

Acoustic recording
Acoustic recorders (Song Meter SM4, Wildlife Acoustics, Massachusetts, USA) were deployed to record on days immediately preceding or following standardized searches in each site. Recordings were not made on the same days as standardized searches in a site to eliminate the possibility of the presence of a moving observer affecting the behavior of vocal birds (Digby et al. 2013, Klingbeil and Willig 2015, Darras et al. 2018a). Recorders were randomly located on ridge tops but were excluded from a 100 m wide buffer zone internal to the site boundary (Holmes et al. 2014; Fig. 1). The assumption was that this functioned to limit, but not exclude, the recording of individuals calling from outside the site (Hingston et al. 2018). Increasing the buffer width was not possible owing to the consistent narrowness of ridges and associated ridge-top forest amongst sites (Fig. 1).
Because birds have higher detection probability in acoustic recordings from the first three hours post dawn (Wimmer et al. 2013), Song Meters were configured to record continuously for this period until five days of clear-weather recordings had been captured in each site. In this way, a complete data set consisting of a total of 900 min/site of stereo recordings (16-bit wav files) was obtained. Recorders were set to use a sample rate of 24,000 Hz, with no filtering, and the default gain of 16 dB (Venier et al. 2017, Hingston et al. 2018). Microphones had a signal to noise ratio of 80 dB (Darras et al. 2018a). Microphone sensitivity can decline to varying degrees with increasing time in the field (Turgeon et al. 2017), but testing following deployment showed that all microphones remained within the sensitivity range stated by the manufacturer when this type of microphone is new. Sound recorders were attached to a tree of no more than 110 mm diameter, at 2 m above the ground, with a cable lock (Depraetere et al. 2012, Darras et al. 2018a). Limiting tree diameter in this way meant that at least one of the two microphones on opposite edges of the recorder was unobstructed in a horizontal plane.
To identify bird species from recorded calls, acoustic recordings were systematically analyzed in processing software (Kaleidoscope Pro, Wildlife Acoustics, Massachusetts, USA). All recordings were manually processed by viewing spectrograms and replaying and listening to calls to identify species, with repetition as required (Hingston et al. 2018). This was carried out by a single person (MF) and involved recording all species that were present in 20-min sampling periods. This sample duration had been used previously for point counts and acoustic recordings (Darras et al. 2018b) and was selected to match that of standardized searches, to maximize comparability. Several other measures were applied to ensure accurate identification of species. First, collections of recorded bird calls were referred to while processing (Van Gessell and Kane 2002, Buckingham and Jackson 2007). Second, a reference library of calls was developed from the data itself and used for ongoing verification of species identification (Wimmer et al. 2013). Finally, where there was any uncertainty about the species responsible for vocalizations, calls were given a unique code during processing, and additional expert opinion was sought (Celis-Murillo et al. 2012; see Acknowledgments).

Multispecies occupancy modeling framework

A potential acoustic survey protocol
The complete acoustic data set was subsampled to devise a potentially useful acoustic survey protocol, for comparison with standardized searches using occupancy modeling. The acoustic protocol that we used consisted of five 20-min periods immediately following dawn for two consecutive, clear-weather days. This protocol was designed to match with standardized searches in terms of length of sample period (20 min), average number of samples per site survey (10 x 20 min), and the number of consecutive days over which a survey was conducted (two). This short site survey period reduced the overall duration of the study, which meant that the multispecies occupancy modeling assumption of a closed population was more likely to have been met (Iknayan et al. 2014).
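Subsampling the complete data set into a protocol of this kind can be sketched as follows. The sketch assumes, purely for illustration, that detections are stored in a dictionary keyed by (day, period), with 20-min periods numbered from 1 at dawn; the function names and species labels are hypothetical.

```python
def subsample(records, days, periods):
    """Select the sampling units for the given days and post-dawn periods.

    `records` maps (day, period) -> set of species detected in that 20-min
    period. Only units present in `records` are returned.
    """
    return {
        (d, p): records[(d, p)]
        for d in days
        for p in periods
        if (d, p) in records
    }


def species_detected(units):
    """Union of species across the selected sampling units."""
    detected = set()
    for species in units.values():
        detected |= species
    return detected


# Hypothetical complete data set for one site: 5 days x 9 periods of 20 min.
complete = {(d, p): set() for d in range(1, 6) for p in range(1, 10)}
complete[(1, 1)] = {"sp_a", "sp_b"}
complete[(2, 3)] = {"sp_c"}
complete[(2, 8)] = {"sp_d"}  # outside the first five periods
complete[(4, 2)] = {"sp_e"}  # outside the two selected days

# Five 20-min periods following dawn on two consecutive days.
protocol = subsample(complete, days=(1, 2), periods=range(1, 6))
print(len(protocol))                       # 10
print(sorted(species_detected(protocol)))  # ['sp_a', 'sp_b', 'sp_c']
```

The same helper can generate other candidate protocols by changing `days` and `periods`.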

Probability of detecting species and guilds
We used a Bayesian approach to model the number of detections $y_{i,j,m}$ of species $j$ in site $i$ under survey method $m$ as a Binomial variable:

$$y_{i,j,m} \sim \mathrm{Binomial}(N_{i,m},\; z_{i,j}\, p_{j,m}) \tag{1}$$

$$z_{i,j} \sim \mathrm{Bernoulli}(\Psi_{i,j}) \tag{2}$$

where $N_{i,m}$ is the number of replicate samples of site $i$ performed with survey method $m$; $z_{i,j} \in \{0,1\}$ is the true, unknown occupancy of the site by species $j$; and $p_{j,m}$ is the probability of detection for this species using method $m$. This treats the detectability of a species using a given survey method as invariant across sites and replicate samples.
We modeled occupancy as a function of a species-site intercept drawn from a Normal distribution with a species-specific mean, plus a site intercept drawn from a standard Normal distribution:

$$\mathrm{logit}(\Psi_{i,j}) = \alpha_{i,j} + \alpha_{i}, \qquad \alpha_{i,j} \sim \mathrm{Normal}(\bar{\alpha}_{j}, 1), \qquad \alpha_{i} \sim \mathrm{Normal}(0, 1) \tag{3}$$

The site intercept allows for the possibility of general site effects that might increase or reduce the probability of occupancy across all species.
The probability of detection for species $j$ using survey method $m$ was represented in a similar way, with a species-method intercept drawn from a Normal distribution with a species-specific mean:

$$\mathrm{logit}(p_{j,m}) = \beta_{j,m}, \qquad \beta_{j,m} \sim \mathrm{Normal}(\bar{\beta}_{j}, 1) \tag{4}$$

Standard deviations of one were used for the Normal priors on the alpha and beta parameters. Northrup and Gerber (2018) warn against large prior standard deviations that can bias a logistic model toward extreme probability values. The overall prior standard deviation for the linear predictor of occupancy above is approximately 1.4, which accords with the value recommended by Northrup and Gerber (2018).
The model was fitted using Markov chain Monte Carlo (MCMC) with JAGS version 4.3.0 (Plummer 2003) via the runjags package (Denwood 2016) in R (R Core Team 2019). We ran four chains with a burn-in period of 4000 iterations followed by a sampling period of 40,000 iterations with a thinning rate of 20. Model convergence was checked using the Gelman-Rubin statistic (Gelman and Rubin 1992), after which the separate chains were combined into a single matrix of 8000 samples.
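To make the structure of the model concrete, the following sketch simulates detection counts for a single species under one survey method, and checks two figures quoted above: the prior standard deviation of the occupancy predictor and the number of retained MCMC samples. It is written in Python rather than the R and JAGS code used in the study, and the function names, argument names, and parameter values are hypothetical choices for illustration only.

```python
import math
import random


def inv_logit(x):
    """Map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_detections(n_sites, n_reps, occ_mean, det_logit, rng):
    """Simulate detection counts for one species under one survey method.

    occ_mean and det_logit are hypothetical logit-scale values chosen for
    illustration; they are not estimates from the study. In the multispecies
    model the site intercept is shared by all species at a site; here only
    one species is simulated, so it is simply drawn once per site.
    """
    p = inv_logit(det_logit)  # detection held constant across sites and replicates
    counts = []
    for _ in range(n_sites):
        species_site = rng.gauss(occ_mean, 1.0)  # species-site intercept
        site = rng.gauss(0.0, 1.0)               # general site intercept
        psi = inv_logit(species_site + site)     # occupancy probability
        z = 1 if rng.random() < psi else 0       # latent occupancy state
        y = sum(rng.random() < z * p for _ in range(n_reps))  # Binomial(n_reps, z * p)
        counts.append(y)
    return counts


# Two independent unit-SD Normal terms give the prior SD of the occupancy predictor.
prior_sd = math.sqrt(1.0 ** 2 + 1.0 ** 2)

# Retained posterior samples: chains x (sampling iterations / thinning rate).
retained_draws = 4 * (40_000 // 20)

print(round(prior_sd, 2), retained_draws)  # 1.41 8000
print(simulate_detections(n_sites=10, n_reps=10, occ_mean=0.0,
                          det_logit=0.5, rng=random.Random(1)))
```

Unoccupied sites always yield zero detections, whereas occupied sites yield counts that depend only on the detection probability and the number of replicates.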
For each species, we summarized the probability of detection under each of the two survey methods by calculating the interquartile range and central 90% of posterior distributions. We assessed the degree to which the acoustic protocol was better at detecting each species than standardized searches by calculating the proportion of posterior probabilities that were greater for the acoustic method. To assess whether there was any potential survey method bias in the probability of detecting species at different levels within the vertical forest strata, posterior probabilities of detection for species under each survey method were aggregated according to foraging stratum guild. Data and R code used in occupancy modeling are available online (https://github.com/mfrnkln/birdsurvey1).
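These posterior summaries are straightforward to compute from MCMC output. The sketch below assumes that the posterior samples of detection probability for the two methods are paired by MCMC iteration, which is an assumption of this example rather than a detail stated in the text; the function names and the sample values are hypothetical.

```python
import statistics


def summarize(draws):
    """Interquartile range and central 90% interval of posterior draws."""
    # n=20 gives cut points at 5% steps: index 0 -> 5%, 4 -> 25%, 14 -> 75%, 18 -> 95%.
    q = statistics.quantiles(draws, n=20, method="inclusive")
    return {"iqr": (q[4], q[14]), "central90": (q[0], q[18])}


def prop_acoustic_greater(acoustic, search):
    """Proportion of paired posterior draws in which p_acoustic > p_search."""
    pairs = list(zip(acoustic, search))
    return sum(a > s for a, s in pairs) / len(pairs)


# Made-up posterior draws for one species, for illustration only.
acoustic_draws = [0.6, 0.7, 0.5, 0.8]
search_draws = [0.4, 0.75, 0.3, 0.2]

print(prop_acoustic_greater(acoustic_draws, search_draws))  # 0.75
print(summarize(acoustic_draws))
```

A proportion near 1 indicates that the acoustic protocol was almost certainly better for that species, and a proportion near 0 indicates the reverse.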

Survey completeness framework
Survey completeness, or the number of species detected as a percentage of the total number of species (Watson 2017), was adopted as a metric to evaluate and compare a set of potential acoustic survey protocols. Initially, estimates of the total number of species for each site were obtained using the incidence-based coverage estimator (ICE), which uses information about infrequently detected species to estimate unobserved species (Chazdon et al. 1998, Chao et al. 2000). These estimates (Appendix 1) were based on the complete acoustic data set for each site and were calculated in R using the SpadeR package (Chao et al. 2016). ICE has been shown to outperform other established estimators of species richness in terms of accuracy, precision, and level of bias when recorded acoustic samples have been used (La and Nudds 2016, see also Chazdon et al. 1998). Then, a set of candidate acoustic survey protocols, which reflected a range of different temporal configurations of sampling days and 20-min periods, was developed by subsampling the complete acoustic data set. The completeness level of each of these protocols was calculated for each of the 10 sites and then averaged.
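As an illustration of how these quantities fit together, the sketch below implements one commonly published form of ICE, with species found in 10 or fewer sampling units treated as infrequent, and then expresses a protocol's species count as a percentage of the estimate. It is a Python sketch under stated assumptions, not the SpadeR routine used in the study, whose implementation may differ in detail; the function names and example data are hypothetical.

```python
from collections import Counter


def ice(units, cutoff=10):
    """Incidence-based coverage estimate of total species richness.

    `units` is a list of sets, one per sampling unit, each holding the species
    detected in that unit. Species found in `cutoff` or fewer units are
    treated as infrequent.
    """
    incidence = Counter(sp for unit in units for sp in set(unit))
    infrequent = {sp for sp, k in incidence.items() if k <= cutoff}
    s_infr = len(infrequent)
    s_freq = len(incidence) - s_infr

    q = Counter(incidence[sp] for sp in infrequent)  # q[j]: species in exactly j units
    q1, q2 = q[1], q[2]
    if q1 == 0:
        return float(s_freq + s_infr)  # no uniques, so nothing to extrapolate

    n_infr = sum(j * qj for j, qj in q.items())  # incidences of infrequent species
    m_infr = sum(1 for unit in units if infrequent & set(unit))  # units with any of them
    if m_infr < 2:
        raise ValueError("ICE needs at least two units containing infrequent species")

    coverage = 1.0 - (q1 / n_infr) * ((m_infr - 1) * q1 / ((m_infr - 1) * q1 + 2 * q2))
    if coverage <= 0:
        raise ValueError("estimated coverage is zero; ICE is undefined for these data")

    cv_sum = sum(j * (j - 1) * qj for j, qj in q.items())
    gamma2 = max((s_infr / coverage) * (m_infr / (m_infr - 1)) * cv_sum / n_infr ** 2 - 1.0, 0.0)
    return s_freq + s_infr / coverage + (q1 / coverage) * gamma2


def completeness(n_detected, estimated_total):
    """Species detected as a percentage of the estimated total."""
    return 100.0 * n_detected / estimated_total


# Hypothetical site with four sampling units; species labels are illustrative.
example_units = [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}, {"a"}]
estimate = ice(example_units)
print(f"{estimate:.3f}")                   # 5.996
print(f"{completeness(3, estimate):.1f}")  # 50.0
```

Here four species were observed, two of them in only one unit each, so the estimator adds roughly two undetected species to the total.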

RESULTS
Overall, 57 species were recorded, including seven species that were observed only once (Appendix 2). Of these seven species, only visual observation was made of the Satin Flycatcher (Myiagra cyanoleuca) and the White-throated Gerygone (Gerygone olivacea). The other five species were either seen and heard, or heard only. The Eastern Spinebill (Acanthorhynchus tenuirostris) was the most frequently detected species in both acoustic recordings and standardized searches. Other common species included the Pied Currawong (Strepera graculina), Brown Thornbill (Acanthiza pusilla), and Crimson Rosella (Platycercus elegans). Three nocturnal species were recorded, but they were omitted from data analysis, because they were not targeted by our methods (Appendix 2).
Each of the first five acoustic sampling periods following dawn resulted in the detection of a mean of approximately 11 species, with numbers of species declining in subsequent 20-min periods, for the complete acoustic data set (Fig. 2a). Similarly, the rate at which new species were detected diminished after the first five sampling periods (Fig. 2b). Accordingly, later subsampling of the complete acoustic data set to produce candidate survey protocols was restricted to samples from the first five sampling periods following dawn.

Fig. 2.
The mean (a) and mean cumulative (b) number of bird species detected in acoustic recordings in each 20-min sampling period for three hours immediately following dawn (bars show 95% CI). Means were calculated using data from five days of sampling in each of 10 sites.

Multispecies occupancy modeling
Detection probabilities obtained for species according to the acoustic protocol and standardized searches were spread across the entire range from zero to one (Fig. 3). When the probability of detecting a particular species using our acoustic protocol was ~25% or greater, the probability of detection was almost always much greater than for standardized searches. This is evidenced by the complete separation of the central 90% of posterior distributions, which was the case for almost half of all species detected by these methods. The detection probabilities according to the two methods were relatively similar for species with low detectability. Exceptions were the Jacky Winter (Microeca fascinans), Rose Robin (Petroica rosea), and Eastern Yellow Robin (Eopsaltria australis), which were more readily detected in standardized searches (Fig. 3). For 73% of species detected by these methods, at least 69% of posterior detection probabilities were greater for the acoustic protocol than standardized searches (Table 1).
The survey methods resulted in a similar pattern of detection probabilities for species that forage at all levels in the forest strata and those that mainly feed in the ground/understorey layers (Fig. 4). In both cases, the acoustic protocol resulted in generally higher detection probabilities than standardized searches for these groups. Median detection probabilities resulting from both survey methods for canopy/subcanopy species were low. Results are approximate because the foraging stratum guilds differ in number of species (Fig. 4).

Survey completeness
The survey method used to obtain the complete acoustic data set detected on average 88% of the estimated total number of species per site (Table 2). The completeness levels of the acoustic survey protocols that were developed by subsampling the complete data set increased steadily as more samples were added, but gains in completeness were small for large increases in sampling effort. For example, attainment of 73% completeness required an additional 50% of the effort required to obtain 69% completeness (Table 2). For the pairs of protocols using six or eight samples, more species were detected on average with the protocol versions that used fewer samples on more days.

DISCUSSION
We have demonstrated how two different approaches to developing an efficient and reliable acoustic survey protocol may be applied for studies intending to use passive acoustic recording to survey forest birds. Multispecies occupancy modeling and the assessment of survey completeness are two different frameworks, but they each have strengths and advantages that may better suit particular research aims. We have provided examples of the application of both frameworks in the context of dry sclerophyll forests of southeastern Australia, but either may be used as a template in many other global forest types. Furthermore, both frameworks are flexible; for example, a point count method could be substituted for area searches to compare species or guild detection probabilities with those obtained using a potential acoustic protocol.

Table 1. Species detected using the acoustic survey protocol and/or standardized searches (n = 48) are listed with their corresponding species codes. The proportion of times that posterior detection probabilities were greater for the acoustic protocol compared to standardized searches is provided for each species (Acoustic pd > SS pd). Species membership of foraging stratum guild categories is shown (Marchant and Higgins 1994, Higgins and Davies 1996, Higgins 1999, Higgins et al. 2001, 2006, Higgins and Peter 2002).
Species: Jacky Winter; species code: JWNT; Acoustic pd > SS pd: 0; guild†: All
† Guild category abbreviations: All (all strata), G/U (ground/understorey), C/S (canopy/subcanopy), A/G (aerial/ground).

Multispecies occupancy modeling
Detection probabilities for 73% of species recorded using these methods were greater with the acoustic survey protocol than the standardized search (Table 1). Many of these species had a relatively high probability of detection (Fig. 3), so the acoustic protocol was a better survey method for those species that are more readily detected in these forests. In mainly forested habitats of California, surveys using acoustic recordings also resulted in higher average detection probabilities than point counts conducted by an observer (Furnas and McGrann 2018). Detection probabilities for species that were recorded only once, or a few times, were typically low, regardless of survey method (Fig. 3, Appendix 2).

Table 2. Survey completeness levels for methods using acoustic recordings. In the first row, the method used to acquire the complete acoustic data set is shown. The complete data set was subsampled to produce the following candidate set of nine potentially useful survey protocols for evaluation.

There are several factors that would have influenced the variation in species detection probabilities between the methods. It is likely that the capacity to replay and review more complex sound files through manual processing contributed to the higher detection probabilities obtained with the acoustic protocol (Shonfield and Bayne 2017). The acoustic protocol was restricted to the first five 20-min samples following dawn, which was when more species were detected by their call per sample (Fig. 2). However, standardized searches were conducted up to 180 min following dawn, and so included a less productive period, which may partly explain some of the lower detection probabilities obtained by this method. Some of the differences in results obtained by standardized searches and the acoustic protocol were probably due to the fact that these methods sampled different parts of each site.
The capacity to detect bird calls over distance is not the same for human observers and acoustic recorders, which is likely to have contributed to variation in results (Yip et al. 2017). Records of species heard calling outside the 2-ha area during standardized searches were excluded from analysis, but acoustic recorders would have been sampling an area larger than 2 ha for species that have the capacity to transmit their calls over relatively long distances. However, the inclusion of such species in the standardized search data set had a negligible effect on detection probabilities (Appendix 3). Nevertheless, we recommend that, if possible, recorders are located several hundred meters inside site boundaries.
The capacity of the observer to sight birds resulted in standardized searches being superior to the acoustic protocol in detecting three species from the family Petroicidae. The Rose Robin, Jacky Winter, and Eastern Yellow Robin were either only detected visually or were seen but seldom heard in the study. In Tasmanian forest, the Dusky Robin (Melanodryas vittata) was at times only seen by the observer and was one of the few species heard and seen by the observer in 100 m radius point counts that was not detected by the acoustic recorder (Hingston et al. 2018). In the same study, the Flame Robin (Petroica phoenicea) had also been recorded as only seen, and it was found that the human observer was better than acoustic recorders at detecting its calls. There can be considerable seasonal and diurnal variation in the extent to which the Eastern Yellow Robin and Jacky Winter vocalize (Keast 1994). For example, peak vocalizations may occur at the start of the breeding season, and/or during the predawn period, neither of which was sampled in our surveys. Future work using autonomous recording units should include sampling in periods of peak vocalization of all species of interest when establishing the timing of deployments and recording schedules.
Detection probabilities of foraging stratum guilds were generally greater for the acoustic protocol than standardized searches. There was one exception with the acoustic protocol, in that the median probability of detecting species that mainly forage in the forest canopy or subcanopy was ~25% less than that of species that use lower strata, or those that use all strata. Recorders were placed at 2 m above the ground, so species that use the upper strata of the forest may have been more difficult to detect simply because they were further away from the recorder than species using the lower strata. This result could also be due to the characteristics of the typical calls made by the species in this particular canopy/subcanopy guild (Table 1). Further investigation of potential bias against canopy species in recorded acoustic surveys would be useful.
The 20-min area search has been widely used as a bird survey method in Australia (Watson 2004), so we used 20 min for acoustic samples, to best compare the methods. Twenty-min periods have also been used effectively in other research that compared aspects of observer-based and acoustic recording methods (Darras et al. 2018b). However, experimentation with reducing the length of acoustic samples while varying the temporal arrangement of samples could be a worthwhile area for further investigation, because increased survey efficiency might be achieved (e.g., La and Nudds 2016, Cook and Hartley 2018).
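The per-species comparison summarized in Table 1 (Acoustic pd > SS pd) can be illustrated with a minimal sketch. It assumes that paired posterior draws of detection probability are available for each method; the function name and the toy draws are hypothetical and are not taken from the study.

```python
def prop_acoustic_greater(acoustic_draws, ss_draws):
    """Share of paired posterior draws in which the acoustic detection
    probability exceeds the standardized-search (SS) detection probability."""
    if len(acoustic_draws) != len(ss_draws) or not acoustic_draws:
        raise ValueError("need two non-empty sequences of equal length")
    wins = sum(a > s for a, s in zip(acoustic_draws, ss_draws))
    return wins / len(acoustic_draws)


# Toy draws for one hypothetical species (not data from the study).
acoustic = [0.60, 0.70, 0.50, 0.80]
searches = [0.50, 0.75, 0.40, 0.30]
example_proportion = prop_acoustic_greater(acoustic, searches)
```

A value near 1 suggests the acoustic protocol is the more reliable detector for that species, whereas a value near 0, as listed for the Jacky Winter in Table 1, favors standardized searches.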

Survey completeness
The survey completeness framework relies upon robust estimates of the total number of species. Some estimators of total species, such as ICE (Chazdon et al. 1998, Chao et al. 2000), have been shown to provide reliable estimates with limited numbers of samples (La and Nudds 2016). However, extensive sampling in a subset of sites prior to conducting the main study should be carried out, to not only test methods (Watson 2017), but also to enable the behavior of the selected estimator to be assessed in a particular forest community. In the present study, the complete acoustic data set was obtained by processing 900 min/site of recordings and resulted in an average of 88% survey completeness. This level of completeness may be close to the upper limit of what is achievable in some habitats, regardless of effort (Watson 2010). For example, on Barro Colorado Island, 192 hours of sampling a diverse bird assemblage resulted in 91% completeness, but adding 210 hours of results obtained by complementary methods gave less than a 1% increase in completeness (Watson 2010).
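To make the completeness calculation concrete, the sketch below implements the classic form of ICE and divides the observed species count by the estimate. It is an illustration only: the cutoff of 10 samples for "infrequent" species is the conventional default and is assumed here, the function name and toy matrix are hypothetical, and nothing in it reproduces the software or settings used in the study.

```python
from collections import Counter


def ice_estimate(incidence, cutoff=10):
    """Classic incidence-based coverage estimator (ICE) of species richness.

    incidence: list of samples, each a list of 0/1 flags, one per species.
    cutoff: species found in at most this many samples count as infrequent.
    """
    n_species = len(incidence[0])
    # Number of samples in which each species was detected.
    counts = [sum(sample[j] for sample in incidence) for j in range(n_species)]
    infrequent = [j for j, c in enumerate(counts) if 0 < c <= cutoff]
    s_freq = sum(1 for c in counts if c > cutoff)
    s_infr = len(infrequent)
    if s_infr == 0:
        return float(s_freq)
    # q[k] = number of infrequent species found in exactly k samples.
    q = Counter(counts[j] for j in infrequent)
    n_infr = sum(k * qk for k, qk in q.items())
    c_ice = 1 - q[1] / n_infr  # sample coverage of the infrequent group
    if c_ice == 0:
        raise ValueError("coverage is zero: every infrequent species is a unique")
    # Samples that contain at least one infrequent species.
    m_infr = sum(1 for s in incidence if any(s[j] for j in infrequent))
    # Squared coefficient of variation of incidence, floored at zero.
    gamma_sq = max(
        (s_infr / c_ice) * (m_infr / (m_infr - 1))
        * sum(k * (k - 1) * qk for k, qk in q.items()) / n_infr ** 2 - 1,
        0.0,
    )
    return s_freq + s_infr / c_ice + (q[1] / c_ice) * gamma_sq


# Toy data: 4 samples (rows) x 4 species (columns); not from the study.
toy = [
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
estimated_total = ice_estimate(toy)
observed = sum(1 for j in range(4) if any(row[j] for row in toy))
toy_completeness = observed / estimated_total
```

In this toy case four species are observed against an estimate slightly below five, so completeness is roughly 85%. With real data the estimate should be checked for stability as samples accumulate, which is the reason for the extensive pilot sampling recommended above.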
Gains in completeness were small and accrued at a reduced rate as sampling effort was incrementally increased in the set of candidate acoustic survey protocols (Table 2). The level of survey completeness among the candidate set of protocols commenced at 62%, requiring six 20-min samples. An identical level of completeness was obtained for the same amount of total sampling time in woodland and open forest, but in that study, shorter (1-min) recorded acoustic samples were taken randomly over five days from the postdawn period (Wimmer et al. 2013). The highest level of survey completeness among the acoustic protocols was 73%, which required 300 min of recordings. This represents a large amount of survey effort, given that ~78% of 194 reviewed avian studies conducted between 2004 and 2016 sampled for a total of 240 min or less per site survey (Watson 2017). At 73%, the completeness level of this protocol is modest, but it is only 15 percentage points lower than that obtained for the complete data set (88%), for a third of the effort (300 vs. 900 min/site; Table 2).
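The way a candidate protocol can be derived from the complete data set may be sketched as follows. The data layout (a mapping from day and postdawn sample index to the set of species heard), the rule of taking the first samples on the first days, the species codes, and the assumed total of eight species are all hypothetical choices made for illustration.

```python
def species_in_protocol(detections, n_days, n_samples_per_day):
    """Species detected when only the first `n_samples_per_day` postdawn
    samples on the first `n_days` days are processed.

    detections: dict mapping (day, sample_index) -> set of species codes,
    where sample_index 0 is the first 20-min sample after dawn.
    """
    found = set()
    for (day, idx), species in detections.items():
        if day < n_days and idx < n_samples_per_day:
            found |= species
    return found


def completeness(found, estimated_total):
    """Survey completeness: species detected / estimated total species."""
    return len(found) / estimated_total


# Toy detections for one site over two days (hypothetical species codes).
site = {
    (0, 0): {"SP1", "SP2"},
    (0, 1): {"SP2", "SP3"},
    (0, 2): {"SP4"},
    (1, 0): {"SP1", "SP5"},
    (1, 1): {"SP2"},
    (1, 2): {"SP6"},
}
total = 8.0  # assumed richness estimate, e.g. from ICE
one_day = completeness(species_in_protocol(site, 1, 2), total)
two_days = completeness(species_in_protocol(site, 2, 2), total)
```

Repeating this for each candidate arrangement of days and samples, and averaging across sites, would give a table of completeness against effort comparable in form to Table 2.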
In the survey completeness framework, a protocol selected for use should represent the most efficient way of attaining the required level of completeness, which will be established by the aims of a study. For example, the acoustic protocol we devised that used the first five 20-min samples following dawn for two days had an average survey completeness of 69%. This protocol detected all the common species recorded in the study, as well as several rare species (Appendix 2). The nine species from the complete acoustic data set that were not detected by this protocol were recorded in very few sampling periods overall: 1/450 (one species), 2/450 (four species), 3/450 (two species), 4/450 (one species), and 13/450 (one species). Using results for the more common species can be sufficient to evaluate the effects of temporal or spatial environmental variation on assemblages (Lennon et al. 2004, Callaghan et al. 2017). Furthermore, investigation of the reasons why commonly occurring species are absent from an area is an effective approach to understanding drivers of species richness patterns (Lennon et al. 2004).
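The selection rule described above, choosing the least effort that still reaches the required completeness, can be sketched in a few lines. Only the figures quoted in the text are used; the 200-min value is inferred from five 20-min samples on each of two days, the remaining protocols in Table 2 are not reproduced, and the function name is hypothetical.

```python
def most_efficient_protocol(protocols, required_completeness):
    """Return the (minutes, completeness) pair with the least effort that
    still reaches the required completeness, or None if none qualifies."""
    adequate = [p for p in protocols if p[1] >= required_completeness]
    return min(adequate, key=lambda p: p[0]) if adequate else None


# (total minutes per site, average completeness) as quoted in the text.
# The 900-min entry is the complete data set, not a candidate protocol.
quoted = [(120, 0.62), (200, 0.69), (300, 0.73), (900, 0.88)]

choice_65 = most_efficient_protocol(quoted, 0.65)
choice_70 = most_efficient_protocol(quoted, 0.70)
choice_90 = most_efficient_protocol(quoted, 0.90)
```

Under these figures a study needing at least 65% completeness would settle on the 200-min protocol, whereas a 70% requirement pushes effort up to 300 min, and no option listed here reaches 90%.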

CONCLUSION
Researchers and natural area managers require cost-effective tools to monitor and assess diversity under increasing climatic and anthropogenic pressures on natural systems (Watson et al. 2014). Using acoustic recordings to survey forest birds offers several advantages over observer-based methods (Darras et al. 2019), but frameworks that users can adopt to design their own context-specific acoustic protocol have had limited application. We have demonstrated how two different frameworks can be applied to test and confirm the adequacy of potential survey protocols.
With the multispecies occupancy modeling framework, the acoustic protocol resulted in higher detection probabilities than standardized searches for most species, which established its adequacy for particular research questions. This framework enables the researcher to make decisions about how useful a protocol would be for individual species or guilds. The use of our occupancy modeling framework means that at least initially, a single acoustic protocol can be tested, which minimizes the volume of acoustic sound files that need to be manually processed. With the completeness framework, it was not necessary to conduct observer-based surveys, but manual processing of a substantial quantity of recordings was carried out to enable reliable estimation of the total number of species present. The range of average completeness levels resulting from the candidate set of acoustic protocols provided a basis for the selection of a protocol that could suit a given study aim. Ultimately, the decision to adopt one of the frameworks over the other will be based on the nature of the proposed project and the available resources.
Responses to this article can be read online at: http://www.ace-eco.org/issues/responses.php/1521