Central New South Wales Trip

A week ago I returned from a week’s trip to central New South Wales in Australia, driving from Sydney down to (almost!) Batemans Bay, then up again to Newcastle and then back to Sydney.

The aim was to spend a bit more time exploring places I’d skipped or missed when I motored through quickly on my trips through NSW in 2018 and 2019, which I did end up doing. However, it turned out I’d forgotten that I had actually visited some of the places before, Jervis Bay Territory in particular, although this time around it wasn’t overcast, so I was able to see it (and the white sand beaches) with the sun out.

The weather was excellent, and once again I got some photos I’m pleased with.

Lake Illawarra before sunset:

Sunset looking out over Lake Illawarra, NSW

Green Patch Beach, Jervis Bay:

Green Patch Beach, Jervis Bay, NSW

Mollymook Beach sunrise:

Sunrise at Mollymook Beach, NSW

Fitzroy Falls:

Fitzroy Falls, NSW



Pointcloud Processing Tooling

For some of the more recent LIDAR maps I’ve been producing over the past six months (see this post with initial experimentation; the resulting full-sized maps I’ve rendered so far can be found here), the source data has been in Pointcloud form rather than a ‘flat’ raster image format (GeoTIFF) of just the DSM height values. It therefore requires different processing to clean it up and then convert it into a ‘displacement’ image map, so I can render it as a 3D geometry mesh.

The below rendering is of Sydney, and uses a conversion from Pointcloud source data (available from https://elevation.fsdf.org.au/ as .laz files):

Sydney DSM map LIDAR render

Having the data in Pointcloud format has both advantages and disadvantages over simpler raster height images. One of the main advantages is that there’s often (depending on the density and distribution of the points) more data per 2D unit of ground area, and each point has its own position and attributes, rather than there being just a single average height value per pixel as in a raster image. This means that, in theory, it’s easier to “clean up” the data and remove “bad” points without destroying surrounding data. Another advantage is that, being fully 3D, the points can be viewed from any direction, which is very useful when visualising the data and when looking for outliers and other “bad” points.

The disadvantages are that the data is generally a lot heavier and takes up more memory when processing it: three 8-byte double/f64 values need to be stored for the XYZ co-ordinates (at least when read from LAS files), as well as additional metadata per point, although there are ways to losslessly compress the memory usage a bit. It also needs different tools to process than raster images do. Newer QGIS versions do now support opening and viewing .LAS/.LAZ Pointclouds (in both 2D and 3D views), although on Linux I’ve found the 3D view quite unstable, and other than being able to select points to view their classification, there’s not much you can do to process the points beyond some generic processing which uses the PDAL tooling under-the-hood. It also appears QGIS has to convert the .LAS/.LAZ formats to an intermediate format first, which slows down iteration time when using other processing tooling side-by-side.
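As a rough illustration of the memory cost mentioned above, the following sketch works out how much space the XYZ co-ordinates alone need when each is held as an 8-byte f64. The point count is a hypothetical figure chosen only for the example, and per-point metadata is ignored.

```python
# Rough memory estimate for holding only the XYZ coordinates as f64 values.
BYTES_PER_F64 = 8


def xyz_bytes(num_points: int) -> int:
    """Bytes needed for three f64 coordinates per point (no other attributes)."""
    return num_points * 3 * BYTES_PER_F64


# Hypothetical point count, chosen only for illustration.
points = 500_000_000
print(f"{xyz_bytes(points) / 1024**3:.1f} GiB")
```

Any classification, intensity or colour attributes would come on top of this figure.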

PDAL is a library for translating and manipulating Pointcloud data (similar to GDAL for raster data / GeoTIFFs, which QGIS uses for a lot of raster and vector operations under-the-hood), and it has quite a few useful features including merging Pointclouds (a lot of the source DSM Pointcloud data is only available as tiles of areas, and so the data needs to be merged to render entire cities, either before converting to a raster displacement map or after), filtering points, rejecting outliers and converting to raster image heightfields.

I have, however, found its memory usage somewhat ‘excessive’ for some operations, in addition to it being slow (despite the fact it’s written in C++). Because of this, and also to learn more about Pointclouds and the file formats, I’ve started to write my own basic Pointcloud processing utility application in Rust (the las-rs crate allowed out-of-the-box reading and writing of the .LAS/.LAZ Pointcloud formats, which was very useful for getting started quickly). It doesn’t really do anything especially ‘fancy’ for the simpler operations like merging .LAS/.LAZ files: it just does a naive loop over all input files, filtering the points within based on configured filters, then accumulating them and saving them to the output file. Despite that, it uses a lot less memory than PDAL does and is quite a bit faster, so I’ve been able to process larger chunks of data with my own tooling, with faster iteration time.
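The naive merge loop described above can be sketched roughly as follows. This is a Python illustration rather than the actual Rust tool: the `Point` type, the `merge_points` function and the two filters are all hypothetical, and in-memory lists stand in for the input .LAS/.LAZ files so that no file-reading library is assumed.

```python
from typing import Callable, Iterable, Iterator, List, NamedTuple


class Point(NamedTuple):
    x: float
    y: float
    z: float
    classification: int


PointFilter = Callable[[Point], bool]


def merge_points(sources: Iterable[Iterable[Point]],
                 filters: List[PointFilter]) -> Iterator[Point]:
    """Yield every point from every source that passes all filters."""
    for source in sources:
        for point in source:
            if all(keep(point) for keep in filters):
                yield point


# Two hypothetical in-memory "tiles" standing in for input files.
tile_a = [Point(0.0, 0.0, 12.5, 2), Point(1.0, 0.0, 950.0, 7)]
tile_b = [Point(5.0, 2.0, 30.1, 6), Point(6.0, 2.0, -40.0, 2)]

filters = [
    # Assumption: class 7 is the LAS "low point (noise)" classification.
    lambda p: p.classification != 7,
    # Hypothetical plausible height range for the area being processed.
    lambda p: 0.0 <= p.z <= 500.0,
]

merged = list(merge_points([tile_a, tile_b], filters))
print(len(merged))
```

Writing the loop as a generator means points could be handed to a writer one at a time instead of all being held in a list first, which is one possible way of keeping memory usage down.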

The one area I haven’t tackled yet and am still using PDAL for is conversion to output raster image (GeoTIFF) - which I then use as the displacement map in my renders - however I hope to implement this rasterisation functionality myself at some point.
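For illustration only, one simple way such a rasterisation step could work is to bin the points into a regular grid and keep the highest Z value seen in each cell, which suits a surface model. This is a sketch under that assumption, not how PDAL does it; the function name, parameters and sample points are hypothetical.

```python
import math
from typing import Iterable, List, Optional, Tuple


def rasterise_max_z(points: Iterable[Tuple[float, float, float]],
                    min_x: float, min_y: float, cell_size: float,
                    cols: int, rows: int) -> List[List[Optional[float]]]:
    """Bin points into a cols x rows grid, keeping the highest Z per cell.

    Cells that receive no points stay as None. Row 0 is the row starting at
    min_y, so it may need flipping for image formats with a top-left origin.
    """
    grid: List[List[Optional[float]]] = [[None] * cols for _ in range(rows)]
    for x, y, z in points:
        col = math.floor((x - min_x) / cell_size)
        row = math.floor((y - min_y) / cell_size)
        if 0 <= col < cols and 0 <= row < rows:
            current = grid[row][col]
            if current is None or z > current:
                grid[row][col] = z
    return grid


# Hypothetical sample points: the last one falls outside the grid.
points = [(0.5, 0.5, 1.0), (0.6, 0.4, 3.0), (1.5, 0.5, 2.0), (-0.2, 0.5, 9.0)]
heights = rasterise_max_z(points, min_x=0.0, min_y=0.0,
                          cell_size=1.0, cols=2, rows=1)
print(heights)
```

A real implementation would also need to fill empty cells and write the result out with georeferencing, neither of which is covered here.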

I am on the lookout for better Pointcloud visualisation software (in particular on Linux - a lot of the commercial software seems to be Windows or Mac only). QGIS’ functionality is adequate but not great, and is fairly lacking in terms of selection. Other open source software I’ve found, like CloudCompare, seems a bit unstable (at least when compiling from source on Linux), and it’s not clear how well it’d scale to displaying hundreds of millions of points at once.

I have found displaz which is pretty good for displaying very large Pointclouds (it progressively draws them, and seems to store them efficiently in memory), however it has no support for selection or manipulation of points (by design), so I’m still looking for something which caters to that additional need: in particular the selecting of outlier points interactively and culling them.



Simple Photo Collage Generation

Last week I implemented support for generating very simple (grid only) collages from photos/images in my image processing infrastructure.

Example Photo Collage of Pedestrian crossing lights in Wellington

I had wanted to create some simple grid-based collages of some photos, and I was somewhat surprised to discover that neither Krita nor GIMP (free/open source image manipulation software) seems to provide any built-in functionality to generate this type of output without manually setting up grids/guides and resizing and positioning each image separately. While that’s not difficult in theory, it is somewhat onerous, especially when you want to generate multiple similar collages from different sets of input images in a procedural/repeatable way.

I did briefly look into (free) web-based solutions, however I wasn’t really happy with that avenue either, partly due to most web solutions having the same lack of procedural/recipe generation (i.e. being able to just change the input images and get the same type of result without re-doing things from scratch again), but also because many web solutions seemed more targeted at “artistic” collages with photos having arbitrary positions and rotations, rather than having grid presets, as well as the fact that many (although not all) of the web apps in question required some form of registration or sign-in.

So I ended up just quickly implementing basic support for this collage generation myself in my image processing infrastructure, which took less than two hours. It means I can now generate arbitrary grid collages from a ‘recipe’ source parameters file which configures the target output resolution, the row and column counts, the list of input images, the inner and outer border widths and border colour, as well as the image resize sampling algorithm/filter to use (e.g. bilinear, bicubic or Lanczos) for resizing the input images to fit into the collage grid.
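As a sketch of how the recipe parameters above could map to image placement, the following computes the pixel rectangle for each grid cell from the output resolution, row and column counts, and border widths. The function name, parameter names and example values are hypothetical, and it assumes the outer border surrounds the whole grid while the inner border separates adjacent cells.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def grid_cells(width: int, height: int, rows: int, cols: int,
               outer: int, inner: int) -> List[Rect]:
    """Return one rectangle per grid cell, in row-major order."""
    cell_w = (width - 2 * outer - (cols - 1) * inner) // cols
    cell_h = (height - 2 * outer - (rows - 1) * inner) // rows
    if cell_w <= 0 or cell_h <= 0:
        raise ValueError("borders leave no room for the images")
    cells = []
    for row in range(rows):
        for col in range(cols):
            x = outer + col * (cell_w + inner)
            y = outer + row * (cell_h + inner)
            cells.append((x, y, cell_w, cell_h))
    return cells


# Hypothetical recipe values: a 3 x 2 grid on a 1920 x 1080 canvas.
cells = grid_cells(1920, 1080, rows=2, cols=3, outer=20, inner=10)
for cell in cells:
    print(cell)
```

Integer division means a few leftover pixels can end up in the right and bottom borders when the sizes don’t divide evenly. Each input image would then be resized with the chosen filter to fit its rectangle.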



Great-Circle Route Visualisation with three.js

Great-Circle Route Image

Over the past few years, as well as learning the Rust programming language, I’ve been attempting to learn more about the JavaScript ecosystem for front-end web development, although the seemingly large number of different libraries/frameworks available has made it somewhat daunting to pick a starting point. Because of this, I decided not to just learn whichever generic web frameworks were the most popular at the time, but to explicitly use ones that seemed to match well with particular use-cases I had in mind.

In this case, I’ve been writing a basic web app for visualising Great-Circle Routes (link to app here) between common airports. There are various existing ones on the web in some form or another, but many produce only flat 2D visualisations, and while there are one or two 3D ones showing the routes on a 3D globe, they weren’t exactly to my liking, and regardless, it provided me with an interesting use-case with which to gain some more experience with JavaScript and web-based real-time rendering.

I did at first think about sticking to core vanilla JavaScript and using WebGL directly with no use of additional frameworks/libraries, but given I wanted something working fairly quickly, I decided that using a library would be the more efficient solution, and a quick look at https://threejs.org/ and its capabilities reinforced that decision.

In the end, it was pretty simple to come up with a basic working example: a sphere mesh object with a diffuse earth texture applied, an orbit-able camera, and “route” line meshes representing the Great-Circle Route and the “flat” 2D route in question, drawn in 3D space using spherical coordinates derived from the source latitude and longitude coordinates. The Great-Circle Route between two points was calculated by using the Haversine Formula.
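The geometry involved can be sketched as below, in Python rather than the app’s JavaScript. The Haversine Formula gives the central angle between the two points, and the route points are then generated by spherical interpolation between the two endpoint vectors. The interpolation step is a choice made for this sketch rather than something the app is known to do, and the axis convention (Y up, longitude 0 along +X) and all function names are assumptions.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def lat_lon_to_xyz(lat_deg: float, lon_deg: float, radius: float = 1.0) -> Vec3:
    """Convert latitude/longitude in degrees to a point on a Y-up sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (radius * math.cos(lat) * math.cos(lon),
            radius * math.sin(lat),
            -radius * math.cos(lat) * math.sin(lon))


def central_angle(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine formula: the angle in radians between two points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    d_phi = p2 - p1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(d_lambda / 2) ** 2)
    return 2 * math.asin(min(1.0, math.sqrt(a)))


def great_circle_points(lat1: float, lon1: float, lat2: float, lon2: float,
                        segments: int = 64, radius: float = 1.0) -> List[Vec3]:
    """Points along the great-circle arc, via spherical interpolation."""
    a = lat_lon_to_xyz(lat1, lon1)
    b = lat_lon_to_xyz(lat2, lon2)
    omega = central_angle(lat1, lon1, lat2, lon2)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < 1e-9:
        # Identical or antipodal endpoints: the arc is not uniquely defined.
        raise ValueError("endpoints are identical or antipodal")
    points = []
    for i in range(segments + 1):
        t = i / segments
        wa = math.sin((1 - t) * omega) / sin_omega
        wb = math.sin(t * omega) / sin_omega
        points.append(tuple(radius * (wa * ca + wb * cb) for ca, cb in zip(a, b)))
    return points


# Quarter of the way around the equator, split into two segments.
route = great_circle_points(0.0, 0.0, 0.0, 90.0, segments=2)
print(route)
```

Multiplying the central angle by the Earth’s radius gives the route distance, and the returned points can be used directly as the vertices of a line mesh.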

The most “complicated” thing in the implementation ended up being the Airport text labels. While it was simple enough to use THREE.CSS2DObject to construct high-quality DOM text items that appear in the correct 2D screen position based off the camera/earth position of where the respective airports are on the sphere, occlusion/visibility handling isn’t done automatically with this setup, as the DOM elements are not actually part of the 3D scene: they’re layered on top. So in order to hide the label text of airports when they’re occluded by the Earth sphere geo (i.e. when they’re around the back of the Earth from the camera), I ended up having to send Raycaster queries to detect the visibility of the position of the airports from the camera, and adjust the visibility of the Airport label CSS2DObject objects based off the result of these queries within the main render loop. This was still quite easy to do, however, and seems to work well enough, although doing the raycaster queries within the render loop is probably not the most optimal solution in terms of efficiency, so I might need to look into better ways of handling that.
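One possible cheaper alternative to the Raycaster queries (not what the app currently does) is an analytic test. For a sphere centred at the origin, a point on its surface faces the camera only when the camera lies on the outer side of the tangent plane at that point. The sketch below is in Python with a hypothetical function name, and assumes both the sphere centre at the origin and labels sitting exactly on the surface.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def is_visible_on_sphere(point: Vec3, camera: Vec3, radius: float = 1.0) -> bool:
    """True if a surface point is on the camera-facing side of the sphere.

    Assumes the sphere is centred at the origin and the point lies on its
    surface, so the test reduces to dot(point, camera) > radius squared.
    """
    dot = sum(p * c for p, c in zip(point, camera))
    return dot > radius * radius


camera = (0.0, 0.0, 3.0)
print(is_visible_on_sphere((0.0, 0.0, 1.0), camera))   # nearest point
print(is_visible_on_sphere((0.0, 0.0, -1.0), camera))  # far side
```

This is a single dot product per label, so it should be cheap enough to run every frame, but it only accounts for occlusion by the sphere itself and not by any other scene geometry.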

There’s still room for improvement though: the thickness of the route lines should ideally scale with the zoom level in screen-space, and the fixed colour of the lines means it’s not always easy to see them depending on the Earth texture being used underneath them, so I might have to look into some kind of contrast/blending setup in the future in order to make the route lines easier to see.



Timelapse Blending

Over the past few months I’ve made some attempts at timelapse photography, mainly motivated by seeing this site/software on High Dynamic Time Range Images which effectively “blends” multiple images taken at different times into one final image.

Rather than use the above software (which is written in Perl), I decided to write my own implementation using my existing image processing infrastructure, and have so far come up with a simple implementation that supports linear “equi-width” blending. In the future I plan to implement more varied interpolations similar to the original software, as from experimentation, sunrises/sunsets and the progression from day to night are often not linear in the resultant brightness of captured images.
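One reading of linear “equi-width” blending can be sketched as follows. It assumes the frames are spread evenly across the image width, with each output column either taken from a single frame (non-blended slices) or linearly interpolated between the two nearest frames (fully-blended). The function names are hypothetical and this is a guess at the general approach rather than the actual implementation.

```python
from typing import Tuple


def slice_index(x: int, width: int, num_frames: int) -> int:
    """Non-blended: which frame supplies output column x (equal-width slices)."""
    return min(num_frames - 1, x * num_frames // width)


def blend_weights(x: int, width: int, num_frames: int) -> Tuple[int, int, float]:
    """Fully-blended: the two frames to mix for column x and the mix factor.

    Returns (first_frame, second_frame, t), where the output value is
    first * (1 - t) + second * t.
    """
    if num_frames < 2 or width < 2:
        return (0, 0, 0.0)
    position = x / (width - 1) * (num_frames - 1)
    first = min(int(position), num_frames - 2)
    return (first, first + 1, position - first)


def blend_value(a: float, b: float, t: float) -> float:
    """Linear interpolation between two pixel values."""
    return a * (1.0 - t) + b * t


# Hypothetical 9-pixel-wide image built from 3 frames.
for x in range(9):
    print(x, slice_index(x, 9, 3), blend_weights(x, 9, 3))
```

A non-linear variant would replace the straight-line mapping from column to `position` with some other curve.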

Scenes with many lights in that progressively turn on within the timelapse duration seem to work very well generally: here are two examples I’m fairly happy with, showing both non-blended and fully-blended examples of each.

San Francisco:

Time Blend of San Francisco

Time Blend of San Francisco

Wellington:

Time Blend of Wellington

Time Blend of Wellington

There do, though, appear to be some types of scenes that don’t work that well with this technique, in particular ones where the sun is quite prominent or where the sky gradient in the horizontal direction is very noticeable. This can lead to “odd”-looking situations where the image “slices” which show the sky should in theory get darker as you progress through time, but the sky colour gradient in the source images counteracts this on one edge of each image slice, which looks a bit weird (at least to my eyes).

I also tried converting a sunrise timelapse sequence I took several years ago in Australia, which had clouds moving very slowly across the sky horizontally in the frame. This produced what almost looked like an image containing artifacts or a repeating pattern (it was technically correct and valid though), in that the same bits of cloud appeared repeatedly in each image slice, by coincidence, due to their movement across the sky being in sync with the time delay between each subsequent image.

Other things to look out for are temporal position continuity when blending (see the Wellington blended version with the boat masts moving between captures above), where things like people, vehicles, and trees vary position over time, meaning the blending leads to “ghosting” due to the differing positions in the adjacent images which are being blended/merged together.



Trip Photos

Two weeks ago I returned from a trip back to the UK for a few weeks, stopping off in San Francisco for a few days on the way out, and I have almost finished processing the DSLR photos I took, so this is just a quick post containing a single photo each from some of the locations I visited whilst away.

San Francisco:

Photo of San Francisco

Bristol:

Photo of Bristol

Bath:

Photo of Bath Crescent

Cambridge:

s College Cloisters

Chichester:

Photo of Chichester

Needles, Isle Of Wight:

Photo of the Needles, Isle Of Wight



Basic Apple M2 Pro CPU Benchmarks

This month I bought and received a new Apple MacBook Pro (M2 Pro, 14-inch, 2023) with the aim of replacing my own Apple MacBook Pro 15-inch (Intel) 2015 model which I bought in 2016. The MacBook Pro 15 still does work, but the battery life is awful now (I could have it replaced, which I’ve done several times with laptops in the past), the internal fans barely work, and the rubber around the screen is disintegrating, so with a trip back to the Northern Hemisphere planned next month, I thought it was time for a replacement.

A year ago I was provided (on loan) a work-provided MacBook Pro 14 M1 Pro (see previous benchmarks) which I’ve been using a bit, so I knew mostly what to expect in terms of performance and from the laptop in general, but I was curious to compare the performance of the M2 Pro against the M1 Pro (and the old Intel machine).

It’s not going to be a completely fair apples-to-apples comparison, as the work-provided MacBook Pro 14 M1 Pro processor is the 10-core version, which has two more performance cores than the baseline did (eight performance cores and two efficiency cores), whereas the M2 Pro I’ve just bought is the baseline model (six performance cores and four efficiency cores), but it should provide a rough indication of what performance to expect.

The Xcode / Apple Clang compiler versions are also different: My MacBook Pro 15 (Intel) is still running quite an old MacOS version, with an older compiler which I don’t want to update, and while I did install Xcode 14.3 on the 2021 MacBook Pro M1 Pro (as well as the command line tools) in order to attempt to match what I’d just installed on my new 2023 MacBook Pro M2 Pro, clang --version still shows version 13.1.6, whereas my new 2023 M2 Pro MacBook Pro shows 14.0.3 being used, so I’m not really sure what’s going on there, as Xcode -> About Xcode shows Version 14.3.1 as I’d expect on both MacBook Pro 14 machines (and both have the command line tools for that version of Xcode installed). The MacBook Pro 14s are both running MacOS Ventura 13.4.

Tests:

Copying what I did in the test last year, I’ll be using two of my apps as benchmarks: my Mint interpreter language VM (originally based off Robert Nystrom’s excellent Crafting Interpreters Lox language tutorial) but with additional functionality and performance improvements, which I’ll use to benchmark two Mint scripts as single-threaded tests, and also my Imagine pathtracing renderer, which has native SSE intrinsics support for Intel and native Neon intrinsics support for ARM, which I’ll run in both single- and multi-threaded scenarios.

Both apps will be compiled with -march=native on the Intel side and -mcpu=native on the Apple Silicon / ARM side, using the clang version on the machine in question, as well as optimisation level: -O3.

The two Mint script tests will be loop value calculation as Test 1:

var a = 1;
for (var i = 0; i < 100000000; i += 1)
{
    a = (i + i + 3 * 2 + i + 1 - 0.42) / a;
}
print a;

and a variation of Project Euler 21 to calculate the sum of all Amicable numbers under 15,000 as Test 2.
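For reference, the calculation that Test 2 performs can be sketched in Python as below. This is not the Mint script itself: the function name and the sieve-style approach are choices made for this sketch.

```python
def amicable_sum(limit: int) -> int:
    """Sum of all amicable numbers below limit.

    Simplification: a number only counts if its partner is also below limit,
    since the partner's divisor sum is looked up in the same table.
    """
    # divisor_sums[n] ends up as the sum of the proper divisors of n.
    divisor_sums = [0] * limit
    for d in range(1, limit // 2 + 1):
        for multiple in range(2 * d, limit, d):
            divisor_sums[multiple] += d

    total = 0
    for a in range(2, limit):
        b = divisor_sums[a]
        # Amicable: d(a) = b, d(b) = a and a != b (perfect numbers excluded).
        if b != a and b < limit and divisor_sums[b] == a:
            total += a
    return total


print(amicable_sum(15_000))
```

The smallest amicable pair is 220 and 284, so every limit above 284 gives a non-zero sum.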

The Imagine rendering tests will render three different scenes in both single- and multi-threaded mode.

Example Maze Lights scene

The first render will be the same maze scene with spherical area lights that I used in the test last year (example image above), but with different settings: resolution will be 256 x 256, 256 samples-per-pixel will be used, but this time only one next-event light sample will be taken at each path vertex. The general ray traversal and ray intersection will utilise SIMD, but the (fairly expensive) perfect light sphere sampling is scalar, and very unlikely to be vectorised by the compilers themselves.

Example SDF Julia Fractal scene

The second render will be a 450 x 338 resolution render of a Signed Distance Field primitive of a Julia Fractal (example image above, although with different settings), which is quite expensive to evaluate, and also does not have any SIMD utilisation for the SDF evaluation / intersection. There’s a physical Sky IBL in the scene as a light, and 144 samples-per-pixel will be used, with 3x3 Blackman Harris pixel filtering being used.

Example instanced mesh cube scene

The third render will be a 450 x 338 resolution render of 2,326,299 instanced mesh cubes in the (pre-calculated) shape of a Julia Fractal, which will fully-utilise SIMD instructions for ray traversal in the BVH and for ray / primitive intersections. Again, there’s a physical Sky IBL in the scene, 144 samples-per-pixel will be used, but this time no pixel reconstruction filtering will be used (so effectively Box 1x1).

These tests will all be done on (close to fully-charged) battery power - I discovered in the tests last year that Apple doesn’t seem to down-clock on battery power - and I will also wait between test runs for the processor temperatures to be below 50 degC before running the next test, to try and reduce the impact of thermal throttling.

All tests will be run three times, and results below will show the mean average of those numbers.

Results

Single-threaded Mint VM interpreter:

Single-threaded Mint interpreter VM benchmarks (smaller values are better)

For these single-threaded tests, the M2 Pro has a small improvement over the M1 Pro’s performance, which itself is around 10-16% faster than the eight-year-old Intel i7 processor. As mentioned last year however, I think this is very likely because the Mint VM execution is often branch-prediction constrained within the main VM bytecode interpreter loop, so there’s a limit to the amount of Instruction-Level Parallelism that’s achievable from the eight-wide M1 Pro and M2 Pro.

Single-threaded Imagine rendering:

Single-threaded Imagine rendering benchmarks (smaller values are better)

The single-threaded rendering tests, which have a lot more floating point calculations and SIMD usage, show a small performance improvement for the M2 Pro over the M1 Pro. Intriguingly, the Maze Lights scene is the test with the biggest performance increase (almost 2x faster) from the Intel machine to the Apple M1 Pro: the other tests show slightly smaller gains, which I wouldn’t have expected. Without further microbenchmarks of various isolated parts of those tests, it’s difficult to guess why that might be, but the different render tests do exercise different calculations and code paths.

Multi-threaded Imagine rendering:

Multi-threaded Imagine rendering benchmarks (smaller values are better)

The multi-threaded rendering tests show that, because the M1 Pro machine has eight performance cores and two efficiency cores whilst the M2 Pro machine only has six performance cores and four efficiency cores, the M2 Pro machine is only very slightly faster than the M1 Pro machine in the SDF Julia Fractal render scene, and ties in the other two tests. Once again, the Maze Lights scene shows the biggest performance increase (almost 4x faster) from the Intel CPU to the Apple Silicon ones, with the other two tests showing a less dramatic difference (between 2x and 3x faster).

Conclusion

I think that’s not too bad a showing for the baseline model: the M2 Pro beats the M1 Pro by a small margin in all single-threaded tests, and either just about equals or slightly beats the M1 Pro, which has more performance cores (but two fewer efficiency cores), in the multi-threaded tests.



