Archaeological Ground Point Filtering of Airborne Laser Scan Derived Point-Clouds in a Difficult Mediterranean Environment

Digital terrain models (DTM) based on airborne laser scanning (ALS) are an important source for identifying and monitoring archaeological sites and landscapes. However, a DTM is only one of many representations of a given surface. Its accuracy and quality must conform to its purpose and are a result of several considerations and decisions along the processing chain. One of the most important factors of ALS-based DTM generation is ground point filtering, i.e., the classification of the acquired point-cloud into terrain and off-terrain points. Filtering is not straightforward. The resulting DTM is usually a compromise that might show the surface below very dense vegetation while losing detail in other areas. In this paper, we show that in very complex situations (e.g., strongly varying vegetation cover), an optimal compromise is difficult to achieve, and more than one filter with different settings adapted to the varying degree of vegetation cover is necessary. For practical reasons, the results need to be combined into a single DTM. This is demonstrated using the case study of a Mediterranean landscape in Croatia, which consists of open areas (agricultural and grassland), olive plantations, as well as extremely dense and evergreen macchia vegetation. The results are the first step toward an adaptive ground point filtering strategy that might be useful far beyond the field of archaeology.


Introduction
Within the last one and a half decades, airborne laser scanning (ALS) has become a widely used technology in archaeology. This is due to the possibility of deriving detailed digital terrain models (DTM -please note that we are following a Eurocentric diction) even under dense vegetation and shallow water, the growing availability of countrywide ALS-derived DTMs (at least in Europe), and its seeming ease of use. A simple hill-shaded representation of the DTM is almost self-explanatory, even allowing archaeologists untrained in remote sensing to identify previously unknown archaeological and palaeo-environmental features. As a result, ALS has become a fixed component of integrated prospection approaches and has dramatically changed our understanding of archaeological sites, monuments, and landscapes, especially in wooded areas.
The downside of this alleged ease of use is a certain ignorance of the underlying modelling and data processing strategies combined with a naivety of expectations, followed by an incorrect use that might lead to disappointment. Knowledge about its basic principles and workflow is, therefore, a prerequisite for assessment of the suitability of ALS-based datasets for a certain purpose. Consequently, the only eligible countermeasure to this scenario is training and information about the whole process of ALS, from project planning to final visualization.
The basic steps during an ALS workflow are (Crutchley, 2010;Doneus & Briese, 2011;Opitz, 2013;Fernandez-Diaz et al., 2014;Shan & Toth, 2018): • Project planning (including defining the purpose of the scan), • system calibration (determination of lever arms, boresight angles, range and scan angle offsets, and scales), • data acquisition (instrumentation and settings, timeframe and flight mission parameters), • geo-referencing (direct geo-referencing of scan data by combining scanner range and deflection angle measurements and trajectory data derived from GNSS and inertial measurement devices), • refraction correction when employing water-penetrating green lasers, • flight strip adjustment, • classification/ground point filtering (although today, classification (and recently 'semantic labelling') seems Doneus, M, et al. 2020. Archaeological Ground Point Filtering of Airborne Laser Scan Derived Point-Clouds in a Difficult Mediterranean Environment. Journal of Computer Applications in Archaeology, 3(1), pp. 92-108. DOI: https://doi.org/10. 5334/jcaa.44 to be the more widely used term, we will be mainly using the term ground point filtering throughout this paper. While ' classification' is the assignment of points to different classes, 'ground point filtering' is the removal of all points from the georeferenced point-cloud that do not contribute to an archaeologically relevant DTM), • DTM interpolation, and finally, • visualization.
All these workflow steps require considerations and decisions that will affect the suitability of the resulting DTM for a certain purpose.
For archaeology, the most critical part of the workflow is ground point filtering -the removal of all points from the geometrically calibrated, georeferenced point-cloud that do not contribute to an archaeologically relevant DTM (see next section). It is comparatively rare that archaeologists will have the possibility to independently generate a DTM from ALS data, a process requiring access to the unfiltered point-cloud, classification software, and respective data processing expertise. Consequently, ground point filtering is a blackbox in (not only) archaeological applications. ALS data providers usually do not report on software and settings used for classification. In these cases, even when working with project-based data that were specifically acquired for archaeological purposes, reproducibility of the provided DTM is not possible and -even worse -its archaeological value is difficult to assess. This makes it even more important to understand constraints and pitfalls that come along with ground point filtering in order to evaluate the archaeological suitability of an ALS-based DTM.
Still, little has been published on this topic (Doneus & Briese, 2006;Cifani et al., 2007;Crow et al., 2007;Lasaponara & Masini, 2009;Heinzel & Sittler, 2010;Lasaponara et al., 2011;Lugmayr, 2013;Opitz & Nuninger, 2014). To the best of our knowledge, no information can be found on archaeology-oriented ALS-point cloud classification of extremely difficult situations, including varying topography with steep slopes and disparity of vegetation density. These are situations that usually cannot be filtered with a single parameter set. For geomorphological applications, terrestrial laser scans of complex scenes have been classified using a multi-scale dimensionality analysis (Brodu & Lague, 2012). This paper investigates a suitable archaeological ground point filtering of ALS-derived point-clouds for challenging settings, as described above. After presenting some important facts about DTM generation in Section 2, the case study and datasets are introduced in Sections 3 and 4. In Section 5, the filters used are presented. Sections 6 and 7, finally, present and discuss the results, including the propagation of an adaptive filtering strategy combining multiple filtered DTMs.

Ground point filtering of ALS derived pointclouds
As mentioned in the introduction, a crucial step of archaeological DTM generation is filtering the point cloud into terrain and off-terrain points. Any DTM is a representa-tion of the terrain's surface (be it bare earth or including walls and buildings). There is no single DTM that could be regarded as displaying "the" true surface. This is already expressed by the term "model," which is meant to be a simplified image of reality. This implies that one and the same ground surface can be modelled theoretically by an indefinite number of terrain models. It is, therefore, important to know that any DTM is one of many representations of a ground surface. The decision as to whether it is a good or appropriate model depends on the purpose of the terrain modelling, which influences its scale and in consequence, its detail.
It is important to assert that, while a DTM is sensu stricto a representation of the bare earth surface void of vegetation, buildings, cars and the like, an archaeologically relevant DTM may differ from this specification. When a DTM is used to identify archaeological features, a DTM also needs to be void of any vegetation, but buildings (e.g., ruined castles), standing stones, walls, roads, channels, earthworks and the like should survive any filtering and be represented in the final "DTM." Therefore, it might be rather called an "archaeological digital elevation model." Still, this term cannot be coined for all archaeological applications; for example, modelling palaeohydrology of a certain point in time might demand terrain models void of walls and standing stones from later periods.
For the purpose of mapping archaeological features, a suitable DTM should depict bare ground displaying archaeologically relevant micro-relief as mentioned above but at the same time excluding any non-archaeologically induced micro-relief (e.g., tree-stumps). There is usually no evidence available that would provide a proper indication of the "true" ground as well as quantity, appearance, and shape of the archaeologically relevant micro relief. Therefore, during the process of point-cloud classification, the question of whether a filtered DTM is appropriate can only be decided by the person carrying out the task of point cloud classification. In an ideal case, this would be an archaeologist or a domain expert working in close cooperation with the archaeological end-user.
Ground point filtering of point clouds for archaeology is not straightforward. There are a wide range of software packages available that have been reviewed in a small number of papers (Sithole & Vosselman, 2003;Lugmayr, 2013;Polat & Uysal, 2015;Forte & Campana, 2017). Depending on the software used, a greater or lesser number of parameters must be set in order to identify surface points for certain environmental settings. In any case, the creation of a DTM from an ALS-based point-cloud is a compromise between deleting unwanted points while at the same time keeping archaeological detail. This is, to a certain degree, achievable in areas with deciduous trees and sparse underwood in off-leave conditions (although walls could become truncated by the filtering). A dense vegetation cover, however, might prevent most laser pulses from reaching the ground, resulting in a high quantity of vegetation-points and only a few ground echoes. Filtering this kind of situation requires strong filter settings that might also remove archaeological detail in less densely vegetated areas of the dataset.
The most problematic settings are mixed situations with an extremely varying topography (e.g., a mixture of smooth and predominantly horizontal terrain, undulating terrain, and steep slopes) and disparity of vegetation density. These cases usually cannot be classified with a single parameter set. However, to our knowledge, there is no classification software available that could handle such difficult settings using adaptive parameters based on vegetation density and slope. In order to receive optimum results for such diverse areas, repeated classification with particular parameter sets for dense vegetation and more open areas are necessary, resulting in multiple DTMs. For practical reasons during visualization and interpretative mapping, the resulting DTMs should, however, be combined into a single terrain model. In the following, this approach is demonstrated using the case study of a Mediterranean landscape in Croatia.

Case study: The Mediterranean landscape of the Medulin Bay, Croatia
The northern Adriatic peninsula of Istria lies between the Gulf of Trieste, Italy, and the Kvarner Bay, next to Rijeka, Croatia. Along with the port of Pula and the bay of Raša, the Medulin Bay is one of the largest natural anchorages in Istria (Figure 1). It is situated at the southernmost tip of Istria, between Marlera and Kamenjak Cape with a total area of 22 km 2 . The water depth has an average of less than 8 m in the inner bay. A few small, uninhabited islands are scattered around the bay. In the west, the bay of Medulin is bordered by the peninsula of Premantura, in the north and east it is connected to the villages of Pomer and Medulin. The relief is represented by hills and limestone terraces with a maximum height of 80 m and steep slopes towards the coast in the southern part. The vegetation cover consists of open areas (agricultural and grassland), olive plantations, as well as extremely dense and evergreen macchia vegetation (Figure 2).
The region is mainly known for its numerous Roman sites (Koncani Uhač, 2008;Girardi-Jurkić, 2013), the most notable being the villa maritima of Vižula (Džin & Giradi Jurkić, 2008), situated in the municipality of Medulin. Remains of Roman architecture, partly submerged due to the rising sea level, have been documented here over a total length of 1.2 km. The site has been investigated through underwater and terrestrial excavations (Girardi Jurkić et al., 2012;Miholjek, 2012) since the mid-1990s, and via large-scale geophysical prospection from 2014. The EU project "Archaeological Park Vižula" (2017-2019), led by the Municipality of Medulin to preserve and present the site to the public, provided the acquisition of new data by means of ALS in 2018. The objective was to combine underwater and terrestrial research and to understand the relationship between the site and the past landscape.

Data
Approximately 24 km 2 of land and underwater terrain were scanned in the area of Medulin Bay, including Vižula and the peninsula Premantura (Figure 3). A topo-bathymetric laser scanner operating at a wavelength of 532 nm (i.e., visible green domain of the electromagnetic spectrum), which can penetrate clear water (see also Doneus et al., 2013), was used due to the fact that the Roman architecture is currently partly submerged (Vacchi et al., 2016). Data was acquired in March 2018 using a Riegl VQ-820-G topo-hydrographic airborne laser scanner (Steinbacher et al., 2012) operated by Airborne Technologies GmbH in calm water conditions. The entire area was covered in two blocks, with 29 flight strips overlapping at least by 50%. The scan settings are listed in Table 1. Additionally, a medium format digital RGB camera IGI DigiCam-H/39 (equipped with a Hasselblad lens HC 3.5/50) simultaneously documented the area with a ground sampling distance of 7 cm. Each of the scanned strips has an average laser pulse density of 6 shots/m 2 . As the strips have overlaps between 50 and 60%, the laser pulse density has an overall mean of 12 shots/m 2 . This number is, however, not uniform as it varies depending on the number of strips overlapping in an area. Also, areas of deeper water did not return any echoes, and in that way, reducing the average laser pulse density (see Figure 3b). Within a more typical project region (on land with mainly 2-3-fold strip overlap and a mix of open area and high vegetation -as in Figures 6 and 7), the mean pulse density is 15 points/m 2 (see Table 1).
Due to the diverse vegetation, the classification resulted in an uneven distribution of ground points, which will be detailed in the following sections. Overall, the average of classified ground points is 11 points/m 2 . The ratio of laser pulse density and resulting ground points typically lies between 1 (in open areas and shallow water) and 4 (dense Macchia). In the region of dense Macchia from Figures 2b and 10, the average pulse density is as high as 21 (the high number is due to a cross-strip flown over this area), while the average ground point density is only 7.5 points/m 2 .
Data pre-processing was carried out by the lead author using the software package OPALS (Mandlburger et al., 2009;Pfeifer et al., 2014) and comprised the following steps: 1. Echo detection: this was achieved using the scanner's online waveform processing capability ( Pfennigbauer & Ullrich, 2010;Pfennigbauer et al., 2014) and the scanner manufacturer's software RiProcess. 2. Derivation of a water surface model. Due to the calm water conditions and short acquisition time, modelling in the air-water-interface with a single horizontal surface at the water level was considered sufficient. Minor influences from residual waves and tidal differences were ignored. 3. Georeferencing of each scanned strip by first combining the synchronously captured flight trajectory (GNSS/IMU) and scanner measurements (raw range and scan angle), and by subsequently performing range and refraction correction of the water echoes based on the water surface model (Snell's law). 4. Strip adjustment of georeferenced point cloud including quality control of the results (Ressl et al., 2008;Ressl et al., 2011).
The application of these four processing steps resulted in a dataset of 29 georeferenced, refraction-corrected, and adjusted point clouds in ASPRS LAS format, each representing one of the original strips.

Ground point filtering methodology
Strip adjustment is usually followed by the classification, which uses appropriate algorithms to classify the geometrically calibrated point cloud amongst others into ground points, clutter points (low and high points), low, medium, and high vegetation. This process is also known as semantic labelling of the dataset. In our case, the aim was to derive an archaeological digital elevation model as specified in Section 2. The software package SCOP++ was employed, which uses robust hierarchic interpolation with an eccentric and asymmetrical weight function (Kraus & Pfeifer, 1998 Table 2). Each filter is composed of 13 steps arranged in five groups. Each of the first three groups contains:  Thinning method mean k-th lowest (4) Step  • A ThinOut step, where the point cloud is thinned out using a raster with user-defined cell size. For each cell, one output point is chosen using a specified parameter (for a DTM, one would usually choose lowest, k-th lowest, or mean). • A Filter step, applying a robust filter with an eccentric and asymmetric weight function to the thinned point cloud: First, an initial surface is computed using linear prediction (equivalent to ordinary Kriging) and equally weighted points. Then, each point is assigned an individual weight based on the distance from the initial surface using a user-defined weight function. Points below the computed surface receive a higher weight than points lying above the surface (= unsymmetrical weight function -see Figure 4). After that, a new surface is calculated based on the applied weights. The procedure is iterated until a certain threshold is reached. • A SortOut step, which calculates the z-difference of each point of the original point cloud and the resulting surface from the previous filter step. Points located outside a user-specified interval around the filtered surface are disregarded for the subsequent iteration round.
This sequence of ThinOut -Filter -SortOut is repeated within the first three groups with consecutive finer settings (see Table 2). In the fourth group, ThinOut and Filter are followed by an interpolation of the resulting DTM. Finally, a threshold-based classification is applied that allocates all original points into high vegetation, medium vegetation, low vegetation, ground, and points below DTM. The most important parameters of both filter strategies are listed in Table 2. Due to many clutter points both below and above the actual terrain, the thinning method had to be set to "mean" in the first group (step 1). Such clutter points are typical for any (topo-) bathymetric scanners employing highly sensitive receivers to detect very weak echoes from the benthic layer (see Figure 5). This led to a surface that would be close to the actual terrain in open areas, at the same time being above the terrain in vegetated areas. While the most excessive clutters had been removed by the first SortOut (step 3), clutter points closer to the actual terrain surface were still present. To force the surface down towards the actual terrain in vegetated areas, the thinning method is subsequently set to "k-th lowest" in steps 4 and 7 for the "dense vegetation" parameter set.

Result
The final DTMs differ, each displaying varying degrees of success in representing archaeologically relevant terrain, as demonstrated in Figure 6, which depicts a section of a coastal area. The area is characterized by small parts of open land alternating with dense macchia. The lower half of the image is dominated by the clear waters of the Mediterranean Sea, in which the green laser was able to penetrate down to 8 m depth.
The hillshade in Figure 6a is based on the ground point filtering that resulted from the "dense vegetation" parameter set. Here, the vegetation was removed. Consequently, the area in the upper half shows the terrain, which includes archaeologically relevant structures (a collapsed and eroded dry-stone wall - Figure 6a/1). However, the open area was not modelled correctly. In particular, the terrain of the seabed displays an abundance of seemingly pit-like structures. These must be regarded as artificial features resulting from clutter points below the seabed surface that could not be completely removed due to the filter settings.
The hillshaded terrain model in the lower image (Figure 6b) is based on the "open area" parameter-set. It exhibits a well-filtered surface in the non-vegetated areas and under shallow water. Without the disturbing pit-like structures, the hillshade now even shows traces of  submerged walls (Figure 6b/2). However, in this case, the dense vegetation was not filtered properly. As a result, the area in the upper half of the image is largely disturbed. The imperfectly filtered vegetation produces a seemingly raised "terrain" that obscures the actual ground surface and all the archaeologically relevant dry-stone walls.

Merging DTMs
The comparison of both datasets indicates that a single filter setting would not have been a useful approach, as it would have ended in a less than optimal compromise. When comparing both filtered DTMs, it becomes apparent that the well-filtered areas are exclusive in each DTM. Therefore, in order to obtain a single terrain model that would be entirely useful for our archaeological purpose, the most suitable solution was to merge both DTMs.
To identify a ruleset for merging both datasets into a single, archaeologically optimal DTM, both filtered models were analysed together with the orthophotographs and point density maps of the classified ground and vegetation points. During analysis, it became evident that there were two key factors behind an archaeological ground point filtering strategy: (1) Vegetation density was the key factor that made two different filter-sets necessary. After visual inspection of the open area DTM with the overlaid vegetation point density map, empirical testing showed that the "open area" parameter set failed to remove vegetation if the vegetation point densities (counting all echoes) exceeded 6 points/m 2 (Figure 7). (2) On some occasions, underwater areas were falsely classified as vegetation during classification (Figure 8). and "open area" ground point filtering strategies in a single DTM (Figure 9 compared to Figure 6).

Discussion
The merged DTM represents a surface that is an optimal compromise of two ground point filtering processes adapted to different environmental settings: it retains micro-topographic detail and displays an archaeologically interpretable surface, even below areas of dense vegetation (Figure 9). In order to merge DTMs based on a point density layer, the underlying point density maps should be derived for cell sizes of at least 5 m edge length or smoothed with a similar kernel size. The reason for this is that densely vegetated areas often expose small areas of 1 × 1 or 2 × 2 m 2 containing little or no vegetation. Using point density raster maps with a smaller cell size would lead to the effect that, while the cells of densely vegetated areas receive height values from the "dense vegetation"   DTM, some smaller less vegetated patches would receive height values from the "open area" DTM, causing substantial differences (perhaps even a few meters above ground). Consequently, the small patches receive "incorrect" heights (by roughly 0.5 to 3 m) from the surrounding cells, resulting in a combined DTM that is not useable. The quality of the ground point filtering is also demonstrated in Figure 10, which displays an area with extremely dense vegetation (corresponding to the on-site photograph displayed in Figure 2b). Still, the filter could classify enough ground points, so that past field boundaries (visible today only as shallow earthworks) become evident in the resulting hillshade. At the same time, detailed topographic features such as the submerged wall (Figure 10/1) are preserved in the merged DTM. Stillas mentioned in Section 2 -we cannot ascertain that the ground was correctly classified in the entire area. Other, maybe less pronounced or non-linear archaeological features still might have been filtered out. Note that the holes in the DTM denote areas where ground points could not be classified for coherent gridding at 0.5 m resolution.
While the merged DTM combines the archaeologically relevant terrain information from (in this case) two filter settings, it still represents only a model of the real-world-terrain. Although it can be regarded as an optimized version, accuracy and archaeological detail will still largely depend on the quality of the filtered input, which is affected by the original data (data acquisition, state of vegetation, quality of georeferencing, type and nature of recent and archaeological features) and on the used classification software and its settings. This is exemplified in Figure 11, which displays the remains of Roman buildings submerged 2 m below the water level. The left image is noisy and shows less detail as a result of missing refraction correction and consequently, sub-optimal strip adjustment. On the right side, the same scene has become much clearer after the correction of refraction and subsequent strip-adjustment.
It is important to note that the presented method was tested in a relatively small area of 24 km 2 comprising a known variety of environmental settings, where the merging ruleset could be adjusted to specific and known parameters. Also, the results could be checked and verified using orthophotographs and on-site visits. Threshold-values were determined through empirical observation. This may be unsuitable for large project (Canuto et al., 2018;Evans et al., 2013) or country-wide datasets containing a wide and unknown variety of environmental settings (Cowley et al., in prep; Fernandez-Diaz and Cohen, in press).
Here, more research needs to be done towards automated adaptive ground point filtering strategies. While the presented approach is simple in the sense that it is based on established and proven standard classification techniques built around clear geometric reasoning, the current trend in semantic labelling of 3D point clouds is indisputably focussed on unsupervised machine learning techniques with a clear trend towards deep learning. Popular techniques like convolutional neural networks (CNNs) outperform most traditional classification techniques, especially in the area of pattern recognition. In the context of classification of 3D ALS point clouds, grid and voxel-based approaches based on 2D-and 3D-CNNs (Zhao et al., 2018;Schmohl & Sörgel, 2019), as well as single point based networks (Winiwarter et al., 2019), are used. While all these approaches have already proven their competitiveness compared to standard classification techniques, they still need abundant training data. In contrast, ready-to-use DTMs for archaeological mapping purposes can be easily achieved with well-defined and easily comprehensible modifications to existing techniques using the approach presented in this paper.
Nevertheless, for smaller and known project areas, merging various DTMs derived from adapted ground point filters is a viable method to combine the advantage of each into a single terrain model. In this rather simple example, different filter settings were applied due to a difference in vegetation cover. Other scenarios might include varying filter settings induced by other factors, such as topography, tree species, or buildings. Accordingly, merging rules would differ and might include information from slopemaps or thematic maps.
The implications of this approach might be important for rethinking archaeological use of ALS-based datasets. Applying adapted ground point filters and defining rules for merging the results demand to be more explicit on filtering strategies. Consequently, the blackbox of ALS-based DTM generation will open with an improved evaluation of the archaeological usefulness of the resulting datasets. Publishing filter strategies are the only way to guarantee the reproducibility of results, at the same time reducing archaeologist's dependence on serendipity.
Normally, archaeologists will not have the skills and specialised software to process ALS data on their own. Still, to make the best use of such data for archaeological prospection, knowledge (in ideal cases), control of data acquisition, and the processing workflow are of crucial importance. When working with project-based data that were specifically acquired for archaeological purposes, georeferencing and ground point filtering of the dataset should not be left to the data providers without supervision (unless they are specialized in laser scanning for archaeological purposes). Maintaining a close contact, including feedback loops, is necessary for a successful archaeological DTM generation. Also, it is recommended to demand for the raw data, i.e. at least the strip-wise recorded echoes from the scanner, the trajectory file(s) and information on scanner-GNSS offsets, as this data is not delivered by default (see also Payne, 2009: chapter 3.1). As methodological and software developments are constantly improved, re-processing and re-filtering might become desirable later. This will, however, require access to the abovementioned raw data (see Fernandez-Diaz and Cohen, in press). For general purpose data, unfiltered point clouds may be available, and it might be worthwhile to invest time and expenditure into a better ground point filtering that is more archaeologically relevant. Even if this is not possible, knowledge about metadata will help to evaluate the archaeological potential of a given general-purpose dataset and therefore will assure a reasonable and successful application of ALS for archaeology (Payne, 2009;Doneus & Briese, 2011;Opitz, 2013;Grussenmeyer et al., 2016;Boardman & Bryan, 2018).

Conclusion
Airborne laser scanning has great potential to reveal archaeological remains in a variety of contexts. This could be demonstrated in the area of the Medulin Bay on the southern extent of Istria, a typical Mediterranean environment with varying degrees of vegetation, including open areas as well as dense and mainly evergreen macchia. Testing various filters to remove the dense vegetation while keeping archaeological detail in less overgrown areas showed that there was no single filter setting that would achieve a satisfying result. As an optimal ground point filtering compromise was unfeasible, two different filters were applied for the same dataset. These resulted in two DTMs that were subsequently merged by rule-based raster calculations and can be regarded as a first step towards an adaptive filtering strategy that might be useful far beyond the field of archaeology.
The merged DTM represents an optimal compromise of two ground point filtering processes adapted to different environmental settings: it keeps micro-topographic detail and displays an archaeologically interpretable surface even below dense vegetation. Therefore, being more explicit on ground point filtering should become best practice (at least for projects which rely on commissioned project-data), as it will enable reproducibility of processing results and make the current blackbox more transparent. This study also stresses the importance of raw data management.