Pointcloud Processing Tooling

For some of the more recent LIDAR maps I’ve been producing over the past six months - see this post for initial experimentation, and here for the full-sized maps I’ve rendered so far - the source data has been in Pointcloud form rather than a ‘flat’ raster image format (GeoTIFF) containing just the DSM height values. This means it requires different processing to clean it up and then convert it into a ‘displacement’ image map so I can render it as a 3D geometry mesh.

The below rendering is of Sydney, and uses a conversion from Pointcloud source data (available from https://elevation.fsdf.org.au/ as .laz files):

Sydney DSM map LIDAR render

Pointcloud data has both advantages and disadvantages over simpler raster height images. One of the main advantages is that there’s often (depending on the density and distribution of the points) more data per unit of ground area, and each point has its own position and attributes, rather than being collapsed into an average height value per pixel as in a raster image. This means it’s theoretically easier to “clean up” the data and remove “bad” points without destroying the surrounding data. Another advantage is that, being fully 3D, the points can be viewed from any direction, which is very useful for visualisation and for spotting outliers and other “bad” points.

The disadvantages are that the data is generally much heavier and takes up more memory during processing - at least when read from LAS files, three 8-byte double/f64 values need to be stored for the XYZ co-ordinates of each point, plus additional per-point metadata (although there are ways to losslessly compress the memory usage a bit) - and it needs different tools to process than raster images do. Newer QGIS versions do now support opening and viewing .LAS/.LAZ Pointclouds (in both 2D and 3D views), although on Linux I’ve found the 3D view quite unstable, and other than selecting points to view their classification there’s not much you can do to process the points, apart from some generic processing which uses the PDAL tooling under-the-hood. QGIS also appears to convert the .LAS/.LAZ formats to an intermediate format first, which slows down iteration time when using other processing tooling side-by-side.
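On the memory point: the LAS format itself stores each coordinate on disk as a scaled 32-bit integer, with the scale factor and offset kept in the file header, so one way of losslessly shrinking the in-memory footprint (a sketch of the general idea, not necessarily what my tooling does) is to keep that compact representation in memory and only expand to f64 when a point is actually needed:

/// Compact in-memory point: keep the LAS-style scaled i32 coordinates
/// (12 bytes) rather than three f64 values (24 bytes); other attributes
/// (classification, intensity, etc.) can be packed similarly.
struct CompactPoint {
    x: i32,
    y: i32,
    z: i32,
    classification: u8,
}

/// Per-file transform, as stored in the LAS header.
struct CoordTransform {
    scale: [f64; 3],
    offset: [f64; 3],
}

impl CoordTransform {
    /// Expand back to full f64 world-space coordinates only when needed.
    fn to_world(&self, p: &CompactPoint) -> (f64, f64, f64) {
        (
            p.x as f64 * self.scale[0] + self.offset[0],
            p.y as f64 * self.scale[1] + self.offset[1],
            p.z as f64 * self.scale[2] + self.offset[2],
        )
    }
}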

PDAL is a library for translating and manipulating Pointcloud data (similar to GDAL for raster data / GeoTIFFs, which QGIS uses under-the-hood for a lot of raster and vector operations), and it has quite a few useful features, including merging Pointclouds (a lot of the source DSM Pointcloud data is only available as tiles, so the data needs to be merged - either before converting to a raster displacement map or after - in order to render entire cities), filtering points, rejecting outliers, and converting to raster image heightfields.

I have however found its memory usage somewhat ‘excessive’ for some operations, in addition to being slow (despite being written in C++). Because of this - and also to learn more about Pointclouds and the file formats - I’ve started writing my own basic Pointcloud processing utility application in Rust (the las-rs crate allowed out-of-the-box reading and writing of the .LAS/.LAZ Pointcloud formats, which was very useful for getting started quickly). Despite not doing anything especially ‘fancy’ for the simpler operations like merging .LAS/.LAZ files - it just does a naive loop over all input files, filtering the points within based on configured filters and then accumulating and saving them to the output file - it uses a lot less memory than PDAL and is quite a bit faster, so I’ve been able to process larger chunks of data with my own tooling and with faster iteration time.
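As a rough sketch of what that naive merge loop looks like with the las-rs crate (the exact API details, such as the Read/Write trait imports, vary between crate versions, and the height-range filter shown here is just a hypothetical stand-in for the configurable filters):

use las::{Read, Reader, Write, Writer};

// Merge several .LAS/.LAZ files into one output file, keeping only points
// within a height range. (Writing .laz requires las-rs' "laz" feature.)
fn merge_laz(inputs: &[&str], output: &str, min_z: f64, max_z: f64) -> Result<(), las::Error> {
    // Use the first input file's header as the template for the output file.
    let first = Reader::from_path(inputs[0])?;
    let mut writer = Writer::from_path(output, first.header().clone())?;

    for path in inputs {
        let mut reader = Reader::from_path(path)?;
        for point in reader.points() {
            let point = point?;
            // Hypothetical filter: drop points outside a configured height range.
            if point.z >= min_z && point.z <= max_z {
                writer.write(point)?;
            }
        }
    }
    Ok(())
}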

The one area I haven’t tackled yet - and am still using PDAL for - is the conversion to an output raster image (GeoTIFF), which I then use as the displacement map in my renders; I hope to implement this rasterisation functionality myself at some point.
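The core of that rasterisation is conceptually simple though: bin the points into a regular 2D grid and keep some aggregate (e.g. the maximum height) per cell. A minimal sketch of that binning approach (not PDAL’s implementation; a real rasteriser also needs nodata handling, the GeoTIFF geo-transform / CRS metadata, and filling of empty cells):

// Bin points into a regular 2D grid, keeping the maximum height per cell,
// which is one common way to produce a DSM heightfield from a point cloud.
fn rasterise_max(points: &[(f64, f64, f64)],
                 min_x: f64, min_y: f64,
                 cell_size: f64,
                 width: usize, height: usize) -> Vec<f64> {
    // NEG_INFINITY marks cells which no point has landed in yet.
    let mut grid = vec![f64::NEG_INFINITY; width * height];
    for &(x, y, z) in points {
        let cx = ((x - min_x) / cell_size).floor() as isize;
        let cy = ((y - min_y) / cell_size).floor() as isize;
        if cx >= 0 && (cx as usize) < width && cy >= 0 && (cy as usize) < height {
            let idx = cy as usize * width + cx as usize;
            if z > grid[idx] {
                grid[idx] = z;
            }
        }
    }
    grid
}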

I am on the lookout for better Pointcloud visualisation software (in particular on Linux - a lot of the commercial software seems to be Windows- or Mac-only). QGIS’ functionality is adequate but not great, and is fairly lacking in terms of selection, and other open source software I’ve found, like CloudCompare, seems a bit unstable (at least when compiled from source on Linux), and it’s not clear how well it would scale to displaying hundreds of millions of points at once.

I have found displaz, which is pretty good for displaying very large Pointclouds (it draws them progressively, and seems to store them efficiently in memory), however it has no support for selecting or manipulating points (by design), so I’m still looking for something which caters to that additional need - in particular, interactively selecting outlier points and culling them.



Basic Apple M2 Pro CPU Benchmarks

This month I bought and received a new Apple MacBook Pro (M2 Pro, 14-inch, 2023), with the aim of replacing my own Apple MacBook Pro 15-inch (Intel, 2015) model which I bought in 2016. The MacBook Pro 15 still works, but the battery life is awful now (I could have the battery replaced, as I’ve done several times with laptops in the past), the internal fans barely work, and the rubber around the screen is disintegrating, so with a trip back to the Northern Hemisphere planned next month, I thought it was time for a replacement.

A year ago I was provided (on loan) a work-provided MacBook Pro 14 M1 Pro (see previous benchmarks) which I’ve been using a bit, so I knew mostly what to expect in terms of performance and from the laptop in general, but I was curious to compare the performance of the M2 Pro against the M1 Pro (and the old Intel machine).

It’s not going to be a completely fair apples-to-apples comparison, as the work-provided MacBook Pro 14 M1 Pro has the 10-core version of the processor, which has two more performance cores than the baseline did (eight performance cores and two efficiency cores), whereas the M2 Pro I’ve just bought is the baseline model (six performance cores and four efficiency cores), but it should provide a rough indication of what performance to expect.

The Xcode / Apple Clang compiler versions are also different: my MacBook Pro 15 (Intel) is still running quite an old macOS version with an older compiler, which I don’t want to update. And while I did install Xcode 14.3 (as well as the command line tools) on the 2021 MacBook Pro M1 Pro to try to match what I’d just installed on my new 2023 MacBook Pro M2 Pro, clang --version still shows version 13.1.6 there, whereas the new 2023 M2 Pro MacBook Pro shows 14.0.3. I’m not really sure what’s going on there, as Xcode -> About Xcode shows Version 14.3.1 on both MacBook Pro 14 machines as I’d expect (and both have the command line tools for that version of Xcode installed). The MacBook Pro 14s are both running macOS Ventura 13.4.

Tests:

Copying what I did in the test last year, I’ll be using two of my apps as benchmarks: my Mint interpreter language VM (originally based off Robert Nystrom’s excellent Crafting Interpreters Lox language tutorial, but with additional functionality and performance improvements), which I’ll use to benchmark two Mint scripts as single-threaded tests, and my Imagine pathtracing renderer, which has native SSE intrinsics support for Intel and native Neon intrinsics support for ARM, and which I’ll run in both single- and multi-threaded scenarios.

Both apps will be compiled with -march=native on the Intel side and -mcpu=native on the Apple Silicon / ARM side, using the clang version on the machine in question, at optimisation level -O3.

The two Mint script tests will be loop value calculation as Test 1:

var a = 1;
for (var i = 0; i < 100000000; i += 1)
{
    a = (i + i + 3 * 2 + i + 1 - 0.42) / a;
}
print a;

and a variation of Project Euler 21 to calculate the sum of all Amicable numbers under 15,000 as Test 2.
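The actual Test 2 benchmark is a Mint script, but the logic being measured is roughly the following (shown here as a Rust sketch purely for illustration, not the code that is benchmarked):

// Sum of the proper divisors of n (naive, matching the spirit of the benchmark).
fn sum_proper_divisors(n: u32) -> u32 {
    (1..=n / 2).filter(|i| n % i == 0).sum()
}

// Sum all amicable numbers below `limit` (Project Euler 21, with a higher limit).
fn amicable_sum(limit: u32) -> u32 {
    (2..limit)
        .filter(|&a| {
            let b = sum_proper_divisors(a);
            b != a && sum_proper_divisors(b) == a
        })
        .sum()
}

fn main() {
    println!("{}", amicable_sum(15_000));
}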

The Imagine rendering tests will render three different scenes in both single- and multi-threaded mode.

Example Maze Lights scene

The first render will be the same maze scene with spherical area lights that I used in last year’s test (example image above), but with different settings: the resolution will be 256 x 256, 256 samples-per-pixel will be used, but this time only one next-event light sample will be taken at each path vertex. The general ray traversal and ray intersection will utilise SIMD, but the (fairly expensive) perfect light sphere sampling is scalar, and very unlikely to be vectorised by the compilers themselves.

Example SDF Julia Fractal scene

The second render will be a 450 x 338 resolution render of a Signed Distance Field primitive of a Julia Fractal (example image above, although with different settings), which is quite expensive to evaluate, and which has no SIMD utilisation for the SDF evaluation / intersection. There’s a physical Sky IBL in the scene as a light, 144 samples-per-pixel will be used, and 3x3 Blackman Harris pixel filtering will be applied.

Example instanced mesh cube scene

The third render will be a 450 x 338 resolution render of 2,326,299 instanced mesh cubes in the (pre-calculated) shape of a Julia Fractal, which will fully-utilise SIMD instructions for ray traversal in the BVH and for ray / primitive intersections. Again, there’s a physical Sky IBL in the scene, 144 samples-per-pixel will be used, but this time no pixel reconstruction filtering will be used (so effectively Box 1x1).

These tests will all be done on (close to fully-charged) battery power - I discovered in the tests last year that Apple doesn’t seem to down-clock on battery power - and I will also wait between test runs for the processor temperatures to be below 50 degC before running the next test, to try and reduce the impact of thermal throttling.

All tests will be run three times, and results below will show the mean average of those numbers.

Results

Single-threaded Mint VM interpreter:

Single-threaded Mint interpreter VM benchmarks, smaller values are better

For these single-threaded tests, the M2 Pro has a small improvement over the M1 Pro’s performance, which itself is around 10-16% faster than the eight-year-old Intel i7 processor. As mentioned last year, however, I think this is very likely because the Mint VM execution is often branch-prediction constrained within the main VM bytecode interpreter loop, so there’s a limit to the amount of Instruction-Level Parallelism that’s achievable on the eight-wide M1 Pro and M2 Pro.
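To illustrate what I mean by that (a generic sketch of a bytecode dispatch loop, not Mint’s actual VM): the interpreter’s hot loop is essentially one dispatch per opcode, and that hard-to-predict branch limits how much Instruction-Level Parallelism a wide out-of-order core can extract, regardless of how many execution units it has:

// Generic sketch of why a bytecode interpreter tends to be branch-prediction
// bound: the hot loop dispatches on each opcode, and that branch is hard to
// predict when the executed opcode sequence varies. (Illustrative only.)
enum Op {
    Push(f64),
    Add,
    Div,
    Print,
}

fn run(bytecode: &[Op]) {
    let mut stack: Vec<f64> = Vec::new();
    for op in bytecode {
        // The branch predictor has to guess this dispatch every iteration;
        // mispredictions stall the wide front-end of the core.
        match op {
            Op::Push(v) => stack.push(*v),
            Op::Add => {
                let b = stack.pop().unwrap();
                let a = stack.pop().unwrap();
                stack.push(a + b);
            }
            Op::Div => {
                let b = stack.pop().unwrap();
                let a = stack.pop().unwrap();
                stack.push(a / b);
            }
            Op::Print => println!("{}", stack.last().unwrap()),
        }
    }
}

fn main() {
    run(&[Op::Push(6.0), Op::Push(3.0), Op::Div, Op::Print]);
}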

Single-threaded Imagine rendering:

Single-threaded Imagine rendering benchmarks, smaller values are better

The single-threaded rendering tests, which have a lot more floating point calculations and SIMD usage, show a small performance improvement for the M2 Pro over the M1 Pro. Intriguingly, the Maze Lights scene is the test with the biggest performance increase (almost 2x faster) from the Intel machine to the Apple M1 Pro: the other tests show slightly smaller gains, which I wouldn’t have expected. Without further microbenchmarks of various isolated parts of those tests, it’s difficult to guess why that might be, but the different render tests do exercise different calculations and code paths.

Multi-threaded Imagine rendering:

Multi-threaded Imagine rendering benchmarks, smaller values are better

The multi-threaded rendering tests show that - because the M1 Pro machine has eight performance cores and two efficiency cores, whilst the M2 Pro machine has only six performance cores and four efficiency cores - the M2 Pro machine is only very slightly faster than the M1 Pro machine in the SDF Julia Fractal render scene, and ties in the other two tests. Once again, the Maze Lights scene shows the biggest performance increase from the Intel CPU to the Apple Silicon ones - almost 4x faster - with the other two tests showing a less dramatic difference (between 2x and 3x faster).

Conclusion

I think that’s a not-too-bad showing for the baseline model: the M2 Pro beats the M1 Pro by a small margin in all the single-threaded tests, and can either just about equal or slightly beat the M1 Pro - which has more performance cores (but two fewer efficiency cores) - in the multi-threaded tests.



LIDAR Digital Surface Model Map Renders

I’ve started experimenting with rendering representations of LIDAR-measured Digital Surface Model map datasets, which - in contrast to the more common Digital Elevation Model or Digital Terrain Model datasets, which consist only of the raw terrain elevation - include human-built structures (e.g. buildings) in the height data. My previous map renderings (a full list of the ones I’ve done so far can be found here) have involved DEM data consisting of just the natural terrain, and in terms of scale I’ve focused there on rendering entire countries or islands.

So I’ve been curious to try generating more detailed imagery of more localised areas - in particular of cities, where human-made buildings and architecture are clearly visible - and here are some early initial attempts with the raw DSM data for London.

Example London LIDAR DSM rendering

The above render (Full 2.5k Render) is just a plain render of the surface being displaced by the LIDAR dataset values, using data from the UK Environment Agency’s LIDAR Composite DSM 2022 - 1m dataset, showing central London.

(Note that when downloading the tilesets from the website, the category defaults to the ‘DTM’ Digital Terrain Model version which doesn’t have buildings, so ensure you switch to the ‘DSM’ version if that’s the one you want.)

When zoomed out and viewed from above, the fidelity seems pretty good, showing buildings, bridges and trees in nice detail. Using a slightly less boring shader look - an occlusion shader, driving a diffuse surface colour gradient - gives what I think is a quite pleasing effect, accentuating the streets:

Example London LIDAR DSM occlusion rendering (Full 2.5k Render)

The height scale I’ve used is not currently physically accurate relative to the horizontal extents: I’ve just eyeballed something which looks fairly reasonable, though I think it’s probably still a bit too high.

Once you start to look at the resulting generated surface in a bit more detail, from closer up or at a more horizontal viewing angle, the understandable limitations of the fidelity and format of the data (a single 2D grid of height values) become a bit more apparent, especially in situations like overhangs, where single DEM/DSM values per point on a 2D surface are obviously limited. This can be seen in the representations of bridges (there’s no gap underneath them), Tower Bridge and the London Eye in the alternative view render below:

Close up London LIDAR DSM occlusion rendering

London did seem to have some 25 cm resolution DSM datasets available at some point, but data.gov.uk doesn’t seem to have those any more, although they may still be part of other overall datasets, so I’ll try and find them. That won’t solve the ‘overhang’ problem (you’d need a full 3D point-cloud representation for that), but it should provide extra detail, which might be interesting.

I’d also like to try colouring water areas (and maybe foliage-heavy areas like parks) in specific colours to add a bit more contrast and produce some more “artistic” versions, which I’ll look into over the next few months. Rendering other cities with hillier terrain than London might be interesting as well, in order to get a combination of terrain and human-built buildings. I did look for LIDAR DSM models of Wellington, NZ (where I’m currently living), which has an interesting combination of the two (although most buildings on the hills here are quite small), but I could only find DEM terrain models (without structures) for NZ. Cities like San Francisco are likely to have good data in this category, and I’ll have a look at what’s available for other cities as well.



Basic Apple M1 Pro CPU Benchmarks

Last week I received a work-provided (on loan) new Apple MacBook Pro (14-inch, 2021) with the Apple M1 Pro 10-core processor, so I was very curious to see how it performed with some of my code, given the generally excellent reviews and reports I’d seen by people with it online.

I have my own MacBook Pro 15-inch 2015 (Intel i7) model which I’ll be comparing it against as a basic benchmark performance test, although it won’t be a completely fair test, as the two machines are running different versions of macOS and the clang compiler, and I didn’t want to upgrade my own machine’s macOS version (for various reasons). So the code being executed on the two machines will be generated by different versions of clang, but given the micro-architectures are totally different anyway, I think it’s still a meaningful test, and it will at least show the performance difference between the two laptops.

My old MacBook Pro 15-inch 2015 has a 2.5 GHz Intel Core i7, with four cores, eight threads, and 1600 MHz DDR3 memory, running macOS 10.14.5 (Mojave), with Xcode 10.2.1 installed, which has clang version 10.0.1.

The new MacBook Pro 14-inch 2021 is the M1 Pro 10-core version, with 8 performance cores and 2 efficiency cores, running macOS 12.3 (Monterey), with Xcode 13.4.1 installed which has clang version 13.1.6.

Tests:

I’ll be using two of my apps as benchmarks: my Mint interpreter language VM (originally based off Robert Nystrom’s excellent Crafting Interpreters Lox language tutorial, but with additional functionality and performance improvements), which I’ll use to benchmark two Mint scripts as single-threaded tests, and my Imagine pathtracing renderer, which has native SSE intrinsics support for Intel and native Neon intrinsics support for ARM, and which I’ll run in both single- and multi-threaded scenarios.

Both apps will be compiled with -march=native on the Intel side and -mcpu=native on the Apple M1 Pro ARM side, using the clang version on the machine in question, at optimisation level -O3.

The two Mint script tests will be loop value calculation as Test 1:

var a = 1;
for (var i = 0; i < 100000000; i += 1)
{
    a = (i + i + 3 * 2 + i + 1 - 0.42) / a;
}
print a;

and a variation of Project Euler 21 to calculate the sum of all Amicable numbers under 15,000 as Test 2.

The Imagine rendering tests will render the below example scene of a basic maze with 12 spherical area lights and one hemisphere light, with four Next Event Estimation samples being taken at each bounce to evaluate the lighting, and using Cone sampling to perfectly sample each spherical area light from each hitpoint. 256 samples-per-pixel will be used, with stratified sampling and a ray depth of 6 diffuse bounces. Depending on the setup of the tests (single-threaded, multi-threaded, etc.), I’ll render the scene at resolutions of 150 x 150, 300 x 300 and 600 x 600.

Example Maze Lighting scene

I’ll run all tests on battery power and then on mains power, in case that affects the clocking of the processors, and I’ll also wait between test runs for the processor temperatures to drop below 50 degC before running the next test, to try and reduce the impact of thermal throttling. However, for the longer-running multi-threaded Imagine rendering tests, I’m fairly sure thermal throttling will start to take place on the Intel 2015 machine, given how fast its fans start to spin in longer-running scenarios.

All tests will be run three times, and results below will show the mean average of those numbers.

Results

Single-threaded Mint VM interpreter:

Single-threaded Mint interpreter VM benchmarks, smaller values are better

For these single-threaded tests, the M1 Pro shows fairly modest improvements over the seven-year-old Intel i7 (~11% faster in Test 1, ~16% faster in Test 2). However, as the Mint VM execution is often branch-prediction constrained within the main VM bytecode interpreter loop, I expect this doesn’t really allow the M1 Pro’s eight-wide design to stretch its legs running these benchmarks, as there’s a limit to the amount of Instruction-Level Parallelism that’s achievable.

It’s interesting to see that there’s no apparent difference in performance between running on battery and running connected to the mains: I was expecting to see a difference - even if only on the Intel i7 MBP - as I know my Thinkpad T480s running Linux shows significant performance differences between the two states, but Apple doesn’t seem to be implementing that strategy.

Single-threaded Imagine rendering:

Single-threaded Imagine rendering benchmarks, smaller values are better

Things start to look more impressive for the M1 Pro in the single-threaded rendering tests, as here there are a lot more floating point calculations being done with a lot less branching, and there’s a fair amount of SIMD use (i.e. ray intersection / traversal code, which uses native SSE or Neon instructions for the respective platform architectures) allowing the M1 Pro to show what it can do.

For both the 150 x 150 resolution and 300 x 300 resolution renders, the M1 Pro machine completed the tasks in ~46% less time, so it’s close to twice as fast as the older Intel machine. Again, there was no apparent performance difference between running powered off the battery compared to running with the machines plugged into the mains power.

Multi-threaded Imagine rendering:

Multi-threaded Imagine rendering benchmarks, smaller values are better

The multi-threaded rendering tests show that for the shorter-duration 300 x 300 resolution render, the M1 Pro was more than three times faster (its render time was less than a third of the Intel MBP’s), and for the longer-duration 600 x 600 resolution render this ratio was slightly better still. I suspect this is because the Apple M1 Pro has significantly better thermal performance than the Intel i7, so it is not throttling as much - if at all - although watching the temperatures on the M1 Pro machine towards the end of the test showed it did reach 84 degC, so it’s possible a little throttling was happening.

Conclusion

So for most code that’s not branch/dependency-bound, it would seem the almost one-year-old M1 Pro being tested here is around three times faster than the seven-year-old Intel i7 model (although the base M1 Pro model, with two fewer performance cores, will not be as fast), and for multi-threaded code that parallelises well it can be more than three times faster, with much better thermal performance (which means less fan noise) and battery life. That’s exactly what you’d want from a laptop, and I think it shows that the Apple M1 processors are something to be excited about.



Minimalist Faux 3D Relief Map Renders

Over the past year I’ve become somewhat interested in visualisations of maps and terrain representations, and in particular “artistic” renders of relief maps. I’ve tried making a few different types of map renderings in some form or another over the past few years with varying degrees of success, and over the past few months I’ve been progressively creating what I’m terming “Minimalist” relief maps: a surface is displaced by a Digital Elevation Model terrain height map (often to an exaggerated extent compared to real-world scale) with fairly simple shading, but using light and shadows as a key element to provide a sense of the terrain topology and height in a stylistic way.

Minimalist relief map of Africa

These maps are much less labour-intensive to set up than the historical topographic ones I tried previously, which required a lot of manual warping of the DEM images to match up with the historical map, so there’s a lot less effort (on my part, anyway) to generate these more minimalistic ones, and I still find them very nice to look at.

Minimalist relief map of Antarctica

I’ve experimented with various types of shading; the two main ones I settled on both map the diffuse colour to a colour gradient, driven by either:

  • the terrain height value (the Y coordinate height in worldspace)
  • the occlusion ratio in a hemisphere around the shading point

I think I like the latter best (it’s what both of the renders above use), as it allows nice highlighting of the edges and gradients of steeper terrain, although it means the stronger colours are generally in darker, more occluded areas, which hides them a bit, and the renders are slower as well, as additional raytracing needs to be done during shading to calculate each surface point’s occlusion.
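The mapping itself is simple: the fraction of occluded hemisphere samples at each shading point drives a two-colour gradient for the diffuse colour. A small sketch of just that last step (illustrative only, not Imagine’s shader code; the occluded-sample count would come from tracing short rays over the hemisphere around the surface normal):

#[derive(Clone, Copy)]
struct Colour {
    r: f64,
    g: f64,
    b: f64,
}

// Linear interpolation between two colours.
fn lerp(a: Colour, b: Colour, t: f64) -> Colour {
    Colour {
        r: a.r + (b.r - a.r) * t,
        g: a.g + (b.g - a.g) * t,
        b: a.b + (b.b - a.b) * t,
    }
}

/// Map an occlusion ratio (0.0 = fully open, 1.0 = fully occluded) onto a
/// colour gradient to use as the diffuse surface colour.
fn shade_from_occlusion(occluded_samples: u32, total_samples: u32,
                        open_colour: Colour, occluded_colour: Colour) -> Colour {
    let ratio = occluded_samples as f64 / total_samples as f64;
    lerp(open_colour, occluded_colour, ratio)
}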



Displaced Height Historical Map Renders

In a repeat of last year - when I was inspired to attempt my own versions of some nice-looking “artistic” population density maps I’d noticed online, which I copied fairly successfully - I recently noticed some historical maps that had been rendered with an exaggerated displaced height to the surface, giving the impression of terrain when combined with lighting and shadowing from a renderer.

So again, I was keen to try and create my own copies of these, and to see how difficult it was.

I discovered the existence of the David Rumsey Map Collection which has a very large collection of digital scans of historical maps of locations around the world, which is very interesting to browse through, and provided a good starting point for the visual “historical” texture for the maps.

Many of them are in very good condition, but some of the older and more interesting ones are (understandably) somewhat aged in appearance and have tape marks / rips in them to varying degrees. While it would have been possible to do some artistic fix-ups / colour-correction first, it wasn’t something I particularly wanted to get into for my initial attempts (I always like to get to the rendering part as soon as possible!), so I just picked a few interesting-looking historical maps which had only minor ageing marks.

Obtaining terrain heightfield DEM (Digital Elevation Model) data is relatively easy from several sources, and I decided on using GMTED2010 data in the end.

The tricky part was fitting the DEM terrain data to the historical map image, which - depending on the map and its source - likely has an unknown projection and a border around the edge, and, as I found out, could also have out-of-date or incorrect terrain for maps more than 100 years old. I ended up having to grid-warp the DEM terrain image to fit the historical map which was going to be the visible texture, which is do-able, but a pretty time-consuming manual process to do well.

With my initial renders using a perspective camera projection from single angles (normally the bottom), I was able to take a few short-cuts with alignment, but for full top-down orthographic camera projections in the future (which will look better), I’ll need to do a better job.

I then rendered them in Imagine, with the historical map image being used as the diffuse surface texture, and the warped DEM image being used as the displacement map.

Iceland Map Rendering

Alaska Map Rendering



Population Density Map Voxel Renders

A few weeks ago I noticed on various social media sites some creative and interesting renderings of 3D maps of countries showing their population density as colour-shaded bars extending vertically upwards in proportion to the population density values; the original source of these was Alasdair Rae. They’re lit with a fairly low-angled soft light, simulating an early/late sun, which also allows the columns’ shadows to highlight the magnitudes.

I decided I’d try and see what would be required for Imagine to be able to create and render similar looking renders from scratch, using only Imagine and the original source data.

The original data was in GeoTIFF format, which is basically TIFF format but with specific metadata encoding geographic properties like projection and lat/long co-ordinates (which I ignored for the moment), and the data type is normally float32 or float64 (double). Imagine already had support for reading TIFF files in general, but not for reading values as full double-width (float64) floats, so I had to implement support for that.

I then created a Scene Builder plugin (a plugin in Imagine’s UI which can procedurally generate scenes based on input parameters) to generate geometry consisting of small cubes/cuboids based off the data values in the image files, using the input image X position for the 3D X axis, the input image Y position for the 3D Z axis, and the population density values for the Y axis height. Values below a threshold would not generate any geometry (e.g. in the ocean, where there’s no land).
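The mapping logic is roughly as follows (a sketch of the idea rather than Imagine’s actual Scene Builder plugin code; cell_size, height_scale and threshold here are hypothetical parameters):

// Sketch of the image-to-geometry mapping: image X -> world X,
// image Y -> world Z, population density value -> column height (world Y).
struct Column {
    x: f64,
    z: f64,
    height: f64,
}

fn build_columns(densities: &[f64], width: usize, height: usize,
                 cell_size: f64, height_scale: f64, threshold: f64) -> Vec<Column> {
    let mut columns = Vec::new();
    for iy in 0..height {
        for ix in 0..width {
            let density = densities[iy * width + ix];
            // Skip cells below the threshold (e.g. ocean, where there's no land).
            if density < threshold {
                continue;
            }
            columns.push(Column {
                x: ix as f64 * cell_size,
                z: iy as f64 * cell_size,
                height: density * height_scale,
            });
        }
    }
    columns
}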

I also had to merge multiple images together, as each source image represented a smaller geographic area, and some of the regions I was interested in had boundaries spanning multiple images. For this I wrote custom code as part of the Scene Builder plugin UI to do manual single-file / batch merging based off index coordinates.

Functionality to then shade the materials (height falloff gradient, mixed with a 3D grid texture), as well as render the image already existed in Imagine, so I was then easily able to render these:

Example Population Density Map rendering of the UK

Example Population Density Map rendering of Australia

I ended up ‘cheating’ slightly by not creating full voxel representations of stacks of cubes for each cell’s column, instead just using a single stretched cuboid per column to save geometry memory.

One issue with these renders is that they just use the original source data projection in image space mapped to 3D space, which, for places far from the equator like the UK, ends up squashing things quite a bit. I’d need to re-project the data in QGIS to rectify that, which maybe I’ll do in the future, as I do think they look pretty nice.

Some of the source data also seems to have artifacts in coastal areas (extra non-existent landmass, resulting from rasterising the input image data), and Imagine’s unlikely to be the right tool for those kinds of image-space data touch-ups, so rendering these areas more nicely would need the data to be fixed up elsewhere first.



