Nearmap AI Tree Canopy Boundaries, Vale Park, early 2018
In the two previous posts, we detailed a 10-year study of Adelaide's changing tree canopy, from 2011 to 2021: how green is your city, and what data sets are available?
One of the few publicly accessible data sets for tree canopy in Adelaide is a LiDAR study performed across 2018/2019. Specifically, it blends data from two surveys, flown in April 2018 and October 2019 (18 months apart), into a single data set, and forms the baseline for what we understand will be future analysis. It uses LiDAR classification to map the extent of tree canopy greater than 3 m in height.
Below, we show a set of comparisons with the Nearmap AI tree vectors from January–March 2018. These form the foundation of one of the nine individual analysis dates used in our study, and should be a relatively good temporal match for the data available in the Urban Heat and Tree Mapping Viewer.

Visual comparisons of Nearmap AI and LiDAR reveal comparable results
High-resolution screenshots of the LiDAR data were spatially matched to the Nearmap data in QGIS using keypoints, so that the reader can flick between the two.
NB: The backing imagery used for the separate LiDAR study is more recent (it appears to be from around 2021) and lower resolution, and should be ignored for those reasons.
While only a qualitative comparison is possible with visual inspection, we suggest there are four things to look out for:
Systematic differences: Does one data set consistently draw a larger or smaller boundary around individual trees, or stands of trees? A sub-question that often comes up: the Nearmap AI medium/high vegetation layer is officially defined as trees greater than 2 m in height, whereas data sets such as this LiDAR study often standardise on 3 m. Does this make a practical difference?
Artefacts: Are there other unusual aspects to the data that don’t appear to cause systematic differences?
False positives: What has been picked up as a tree by either data set, that should not be?
False negatives: Which trees, or groups of trees, were missed by either data set?
You may wonder: were these locations picked to show Nearmap data in a favourable light? The answer is no. I chose one location with very dense tree cover, one with typical suburban tree cover, and one with smaller, more intricately structured patterns of suburban trees. I encourage you to browse the Mapping Viewer to look at the LiDAR data, and the many examples of Nearmap AI tree data online and reported in the media (or contact us to request a demo). My best judgement is that these findings would be consistent across any set of examples from this Adelaide survey. The main bias is that a decade of Nearmap AI data is compared with a single LiDAR capture; different companies, with different sensors and processing systems, may also arrive at different results.

Example 1: Flinders Park
Applying the qualitative comparison criteria above, we can observe:
1. Systematic differences: For both the smaller trees and the larger clumps of tree cover, the boundary areas are similar enough to fall within 'visual tolerance'. Some systematic difference likely exists, as it will between any two methodologies, but it is small enough that proper quantitative analysis would be needed to detect it. On the specific question of the 2 m vs 3 m definition, this does not seem to be an issue. Across a range of tree heights, there are only two or three small trees that Nearmap AI includes but the LiDAR excludes. By contrast, there are perhaps five or six small trees that the LiDAR includes but Nearmap AI ignores. If the 2 m definition were the key factor, one would expect the opposite: Nearmap AI including small trees in the 2–3 m height range that the LiDAR rejects. This reversed result implies that the methodological differences between the two approaches (deep learning on imagery vs laser reflectance) matter more for which trees are included than the subtle definitional distinction between a 2 m and a 3 m minimum tree height.
2. Artefacts: LiDAR starts as a point cloud, which then requires further processing to produce a vector map on which tree canopy cover can be computed. The documentation linked from the Urban Heat and Tree Mapping Viewer describes the data as 8 points per square metre, processed onto a 1 m by 1 m grid. By contrast, Nearmap AI data is fundamentally computed at 7.5 cm/pixel (roughly 178 pixels per square metre), with vectorisation and smoothing applied in post-processing. This explains the somewhat jagged appearance of the LiDAR's 1 m grid cells compared to the smoother Nearmap AI outlines. Further, because the Nearmap AI model uses deep learning to identify trees and other classes by considering all image pixels in a large context area simultaneously, it does not exhibit the same small holes and patchiness apparent in the processed LiDAR data. That said, both of these issues are largely aesthetic, and are unlikely to affect a measure such as suburb-level (or even mesh-block) tree canopy cover.
3 & 4. False positives/negatives: Neither data set appears to include significant false positives or negatives, with the exception that some smaller trees, potentially only a few metres tall, show a level of disagreement between the two. While "boots on the ground" verification could clear this up, the difference seems insignificant for tree canopy analysis. One would also have to consider whether, for example, a single branch poking above the rest is sufficient to classify something as a tree.
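As a quick sanity check on the resolution figures discussed under point 2, the sampling densities can be verified with a few lines of arithmetic. This is a minimal sketch: the only inputs are the figures quoted in this post (8 LiDAR points per square metre processed to a 1 m grid, and 7.5 cm/pixel imagery).

```python
# Sampling densities quoted in this post, as simple arithmetic.
lidar_points_per_m2 = 8        # raw LiDAR point density
lidar_grid_cells_per_m2 = 1    # after processing onto a 1 m x 1 m grid

pixel_size_m = 0.075           # 7.5 cm ground sample distance
pixels_per_m2 = (1 / pixel_size_m) ** 2

print(f"Imagery pixels per square metre: {pixels_per_m2:.0f}")  # ~178
print(f"Ratio vs processed LiDAR grid: "
      f"{pixels_per_m2 / lidar_grid_cells_per_m2:.0f}:1")
```

The two-order-of-magnitude gap in effective grid density is what drives the visual difference between the jagged 1 m LiDAR cells and the smoother Nearmap AI outlines.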
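The "proper quantitative analysis" alluded to under point 1 would typically rasterise both canopy layers onto a common grid and measure their agreement. The sketch below is purely illustrative: the two 0/1 grids are toy data, not the actual survey layers, and the metrics shown (intersection-over-union plus per-layer canopy cover) are one common choice, not necessarily what a real study would use.

```python
# Toy 0/1 canopy masks on a common 4 x 4 grid. In a real analysis,
# these would come from rasterising each vector layer onto the same grid.
nearmap = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
lidar = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]

# Flatten to (nearmap, lidar) cell pairs
cells = [(a, b) for row_n, row_l in zip(nearmap, lidar)
         for a, b in zip(row_n, row_l)]

intersection = sum(1 for a, b in cells if a and b)  # canopy in both layers
union = sum(1 for a, b in cells if a or b)          # canopy in either layer
iou = intersection / union

# Per-layer canopy cover: fraction of grid cells flagged as canopy
cover_nearmap = sum(a for a, _ in cells) / len(cells)
cover_lidar = sum(b for _, b in cells) / len(cells)

print(f"IoU: {iou:.2f}")
print(f"Cover (layer A): {cover_nearmap:.0%}, (layer B): {cover_lidar:.0%}")
```

Note how the two layers can report identical overall canopy cover (50% each here) while still disagreeing cell by cell, which is why an agreement metric such as IoU complements a simple cover percentage.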