Tasmania Trip

I got back from a nice trip to Tasmania last week, which means I’ve now visited all of Australia’s states, and most of its territories (not Jervis Bay!).

The weather was not great, although it was mostly dry with occasional sun, and I even found a bit of snow falling (and settling) in the Central Highlands, where it got down to 3°C, as well as some on top of Mount Wellington in Hobart.

One surprise was driving through a region called Dorset: it’s funny how settlers re-used names from home in other parts of the world.

A beach on the East coast of Tasmania

Table Cape, North coast of Tasmania

Central Highlands of Tasmania

Coming back to New Zealand via Sydney, the number of bushfires in the area was very apparent when flying in and out, from the smoke in the air, the haze, and the limited visibility.



HTTPS TLS Webserver support with S2N

For close to two years now I’ve been developing my own webserver. This was partly to re-learn about HTTP and web technology, which I hadn’t been following closely for over eight years, so my knowledge of the area had atrophied; but it was also so I could have a stand-alone webserver that I control and can customise to my needs within one stand-alone codebase, rather than having to use multiple technology stacks, which from my point-of-view is a huge benefit (much less configuration and fewer things to learn/remember). My main use-case for having a webserver is developing my own photo gallery, with photos described in various ways (by date, geographical location hierarchy, tags, camera type, etc.), which off-the-shelf solutions don’t provide to the degree I want. I’ve now got a solution which supports HTTP 1.1 fairly robustly and which I’m relatively happy with from a C++ technology perspective, although the JavaScript/HTML/CSS parts - with which I’m not very experienced - are pretty basic and likely less than ideal in terms of good practice.

For hosting things on the public web, HTTPS support is generally a very good idea these days, and given there’s authentication / password functionality in my photo gallery, HTTPS support is critical to prevent sniffing of HTTP headers (among other security risks). Generally it’s not ideal to put a self-created webserver directly on the public internet with public ports open: it would be better practice to put it behind a well-known and reliable web/proxy server like nginx or Apache. However, I was curious what would be involved in implementing HTTPS secure socket functionality at both the HTTPS / TLS and TCP levels, so I decided to start implementing HTTPS support a few months ago.

At first I had intended to use OpenSSL directly, however a bit of research indicated that might not be the best choice given the complexity and breadth of OpenSSL’s functionality: the possibility of making mistakes when utilising it would be greater, leading to more potential bugs.

There were a few options I could have taken, although some of the software licenses did not entirely match my ideal choices in that area, but in the end I decided to use AWS’s s2n. It’s a minimal open-source TLS implementation which relies on other libraries (i.e. OpenSSL, though it can support others) for the cipher side of things, but it’s simple to use and seems to have sensible defaults.
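As a rough sketch of what the server-side integration looks like (not my exact code - error handling is omitted, and certChainPEM, privateKeyPEM and acceptedSocketFD are placeholder names), s2n wraps an already-accepted TCP socket something like this:

#include <s2n.h>

// one-time setup: initialise the library, then create a config holding
// the server certificate chain and private key (both as PEM strings)
s2n_init();
struct s2n_config* pConfig = s2n_config_new();
s2n_config_add_cert_chain_and_key(pConfig, certChainPEM, privateKeyPEM);

// per-connection: wrap the accepted socket fd and perform the TLS handshake
struct s2n_connection* pConn = s2n_connection_new(S2N_SERVER);
s2n_connection_set_config(pConn, pConfig);
s2n_connection_set_fd(pConn, acceptedSocketFD);

s2n_blocked_status blocked;
s2n_negotiate(pConn, &blocked);

// ... then use s2n_recv() / s2n_send() in place of recv() / send() ...

s2n_shutdown(pConn, &blocked);
s2n_connection_free(pConn);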

Implementation wasn’t too difficult, although generating certificates for localhost is annoying for testing, and not all web browsers were happy with them (Chrome refused to accept them, but Firefox did). It’s also a lot more troublesome to debug what’s going on at the TCP/HTTP level, as you can’t trivially connect Wireshark and sniff what’s going on at the packet level (but that’s the whole point!).

So my webserver (creatively called ‘WebServe’) now supports both HTTP and HTTPS connections, the latter with certificates authenticating the connection, allowing encrypted connections between the web browser and the webserver. It seems to work reliably with Firefox and Chrome on Linux and MacOS, however very occasionally Safari on iOS (but not MacOS) fails to download an image when keep-alive is enabled, so it’s possible there’s still a subtle bug somewhere.



New South Wales and Queensland Trip, 2019

Last week I returned from a trip to Australia, driving from Sydney up to the Sunshine Coast in Queensland, and I’ve just done a first pass of photo processing. The weather was generally excellent (although it was very windy in Byron Bay and on the Gold Coast, so walking on the beach with the sand blowing was not amazingly pleasant), and I thoroughly enjoyed the trip and took some photos I’m very pleased with, including several sunrise ones (getting up somewhat early was worth it!).

Sydney Sunrise:

Sunrise in Sydney

Newcastle Sunrise:

Sunrise in Newcastle, NSW

Lighthouse and rain:

Lighthouse and rain

Sunrise in Port Macquarie:

Port Macquarie Sunrise

Rough Seas at Cape Byron:

Rough Seas at Cape Byron



Photos from Trip

I’ve just got back from a few weeks’ trip to the UK, stopping off in Hong Kong on the way out and Singapore on the way back, which was very enjoyable if also very tiring.

Despite the battery life issue I have with it, the new camera held up well, and I found the Live View functionality very useful for shooting on a tripod and focusing by touching the screen.

I used this method to tick off some (somewhat cliché) location photos in both Hong Kong and Singapore (see below) that I did want to take, and I hope to return to both places in the future to explore more (although the evidence of political rumblings in Hong Kong was fairly apparent).

Hong Kong:

Night photo of Hong Kong

Isle of Wight, UK:

Photo of clock in UK

Singapore:

Photo of Singapore



Astrophotography Attempt

Last night I made a more serious attempt at astrophotography of the Milky Way than my previous ones, using the new camera and lens I got a few months ago.

In comparison to my previous astrophotography attempts in years past, I’m pretty happy with the result, but I still need to work out the best trade-off between ISO and noise level for single exposures, and whether it’s worth shooting the lens wide open at f/2.8 or stopping down slightly.

From a technical/optical perspective I’m not amazingly happy with the Samyang AF 14mm f/2.8 lens I recently bought for this purpose - but I did purposefully get it as the cheaper option given what I’d already spent on new gear back in March, and I knew its downsides from reviews ahead of time, so that’s all on me. The vignetting at f/2.8 is very pronounced, and no software currently seems to have the exact lens profile built in to correct it; it’s also sharpest at around f/3.5, and exhibits a fair amount of coma in the corners until f/4.0.

But still, there’s just something about wide-angle photos of the night sky that seems very magical, and even at f/2.8, to my eye at least, it can capture images I’m personally very happy with:

Milky Way photo

The above photo was taken with a Canon EOS 5D Mk IV, with a Samyang AF 14mm f/2.8 lens, at ISO 2500 with an exposure time of 25 secs at aperture f/2.8.



Fast Log Timestamp parsing for Timestamp delta checking

A year ago I started writing a new basic app (Sniffle) to find log files based on directory/filepath patterns, allowing recursive directory pattern wildcard matching, and then to perform file content operations on the found files - mainly grepping/counting items - with an emphasis on finding files on NFS networks. Existing applications like grep, ack and ag (amongst others) do provide existing functionality for some of the use-cases, but their defaults are generally wrong for mine (i.e. I want symlinks to always be followed, and recursive searching by default), and some of the methods they use to be fast are not always directly compatible (at least efficiently) with NFS networks (i.e. mmap-ing files for better content-searching performance).

The number of logs I’d want to search through was often in the tens of thousands, and some of the logs are (unfortunately, and often needlessly) very verbose and can be > 5 MB in some cases. Using a log database or some other similar infrastructure would generally be the more principled way of solving this issue at scale, but I’m generally the only person who occasionally needs to perform these searches, so getting dedicated infrastructure (i.e. a database) for this use-case from IT was very unlikely - which essentially meant I ended up writing my own solution at home, in my own time, for use at work.

For a while now I’ve had working support for finding files, grepping through those files, counting string matches in files, and finding multiple cascading strings (optionally in particular orders - i.e. if the first one isn’t found, there’s no point looking for any of the others), as well as the ability to filter the list of found files to grep/process first by modified date (e.g. log files last modified within the past 5 days) or by a file size threshold. Recently, however, a new use-case came up at work: I wanted to look for time-period gaps in the log timestamps, indicating that the process writing the logs (a renderer) either hadn’t been doing anything for a while, or had taken longer than expected to do some task.

Checking the time delta between timestamps on consecutive lines in a file is pretty trivial to implement at a conceptual level, however doing so in a way which is performant is a bit more work: naively using strptime() to parse the time has a significant and noticeable overhead, due to the use of mktime(), which is very expensive; and even using sscanf() to pull out the constituent parts and rebuild them, or using a series of atoi() calls - while noticeably faster than strptime() - can be improved upon if you’re just extracting digit components at known positions (although supporting multiple date formats complicates this slightly).
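For reference, the naive baseline looks something like this (a sketch assuming a POSIX platform, with pLine a placeholder pointer to the first digit of a timestamp):

#include <time.h>

// parse the timestamp into a struct tm, then convert it to seconds -
// mktime() is the expensive part (timezone normalisation, etc.)
struct tm timeInfo = {};
if (strptime(pLine, "%Y-%m-%d %H:%M:%S", &timeInfo) != nullptr)
{
    time_t timeValue = mktime(&timeInfo);
}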

Given a fixed and valid timestamp string with consistent-width zero-padding - which I could validate and guarantee in my use-case - I ended up settling on subtracting the char value '0' from each digit character of the timestamp to give its 0-9 integer value, then multiplying that by the digit’s place value (1000, 100, 10 or 1) within its component number and accumulating the results per component, effectively extracting the final values very quickly.

It’s somewhat hacky and verbose, but demonstrably faster than the more normal approaches mentioned above, which for my purposes made it a worthwhile optimisation.

As an example, take a std::string currentString; representing a non-empty log line which is guaranteed to have at least the minimum number of characters for an ISO 8601 date/time stamp, and a size_t timestampStart representing the character position offset within the string where the timestamp starts (in order to support varying formatting around the timestamp), with the start of a log line looking something like this:

[2019-03-22 09:42:13] Did something useful

then code to parse the year from the string looks like this:

uint64_t yearVal = (currentString[timestampStart] - '0') * 1000;
yearVal += (currentString[timestampStart + 1] - '0') * 100;
yearVal += (currentString[timestampStart + 2] - '0') * 10;
yearVal += (currentString[timestampStart + 3] - '0');

Extracting the month and day like this:

uint64_t monthVal = (currentString[timestampStart + 5] - '0') * 10;
monthVal += (currentString[timestampStart + 6] - '0');

uint64_t dayVal = (currentString[timestampStart + 8] - '0') * 10;
dayVal += (currentString[timestampStart + 9] - '0');

and handling the time components is done similarly with appropriate position indices.
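For the example line above, those positions would be offsets 11/12 for the hour, 14/15 for the minute and 17/18 for the second, following the same pattern:

uint64_t hourVal = (currentString[timestampStart + 11] - '0') * 10;
hourVal += (currentString[timestampStart + 12] - '0');

uint64_t minuteVal = (currentString[timestampStart + 14] - '0') * 10;
minuteVal += (currentString[timestampStart + 15] - '0');

uint64_t secondVal = (currentString[timestampStart + 17] - '0') * 10;
secondVal += (currentString[timestampStart + 18] - '0');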

Using these component integer values to represent a full time value is now context-dependent on what you’re trying to achieve: if you just wanted to sort the timestamps, you could simply accumulate the numbers, multiplying each component by a component-respective multiplier to build a sortable pseudo-seconds value, i.e.:

uint64_t currentTime = (yearVal * 365 * 31 * 24 * 60 * 60) + (monthVal * 31 * 24 * 60 * 60);
currentTime += (dayVal * 24 * 60 * 60) + (hourVal * 60 * 60) + (minuteVal * 60) + secondVal;

In other words, using a constant of 31 for the number of days in all months: because we’d only care about relative ordering for sorting, and not absolute deltas, it wouldn’t be necessary to use the correct number of days in each month.

However, for the use-case of working out time durations between each log timestamp, absolute delta values are required, and this does involve knowing the number of days within each month - so that you accurately know that 23:55 on the last day of a month is 10 minutes before 00:05 on the first day of the next month - which makes things a bit more complicated. I ended up using two pre-calculated static const arrays of cumulative day counts for the months, one for non-leap years and the other for leap years, i.e.:

// pre-calculated totals for the number of days from the start of the year to each month
static const unsigned int kCumulativeDaysInYearForMonth[12] = { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
static const unsigned int kCumulativeDaysInYearForMonthLeapYear[12] = { 0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335 };

and then working out whether the year was a leap year by cheating a bit: only checking the first timestamp on the first line of the logfile, and caching the result in a variable for the rest of the log processing:

// a leap year is divisible by 4, and either not divisible by 100 or also divisible by 400
const bool isLeapYear = (currentYear % 4 == 0) &&
                        ((currentYear % 100 != 0) || (currentYear % 400 == 0));
const unsigned int* pCumulativeDaysInMonth = (isLeapYear) ?
                        kCumulativeDaysInYearForMonthLeapYear : kCumulativeDaysInYearForMonth;

which then meant not needing to do any branching for the remaining log line timestamps, and being able to build up the number of seconds a timestamp represented with:

const unsigned int numDaysSinceStartOfYearToMonth = pCumulativeDaysInMonth[monthVal - 1];

// 365 days per year: leap days across year boundaries are ignored (see the caveats below)
uint64_t currentTime = (yearVal * 365 * 24 * 60 * 60) + (numDaysSinceStartOfYearToMonth * 24 * 60 * 60);
currentTime += (dayVal * 24 * 60 * 60) + (hourVal * 60 * 60) + (minuteVal * 60) + secondVal;

which then works to provide exact absolute counts of the number of seconds a timestamp represents in nearly all situations, except for two:

  • The first line of the log file having a timestamp almost two months before the end of February (e.g. December 30th), with later timestamps in the same log file progressing into the end of February: the leap-year check made against the first line’s year could then be wrong for the February dates
  • Daylight Savings Time changes

In the end I decided to ignore both of these situations. The first was a total non-issue for my use-case: if a log file’s timestamps stretched to almost two months, straddling a new year and then on to the end of February, there would be much bigger issues to worry about - almost all the log files should have had a duration of under 48 hours, and anything over two weeks long would be a pathological situation. The Daylight Savings Time change was a definite potential issue - either adding an extra hour to, or removing it from, the time value in the hour after the change, which could then incorrectly trip (or fail to trip) the time delta threshold logic - but I was happy to let that problem slide: dealing with DST changes in computer systems is almost always problematic in my experience (especially when using local timestamps, as here), and while it was technically solvable by knowing the dates the clocks change (per geographic location: different countries change at differing times during the year, and the hemisphere matters for the direction too), I just didn’t feel it was worth it to avoid a few potential false positives/negatives within two to three hours per year.
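Tying the pieces together, the gap check itself then reduces to something like the following sketch, where parseTimestampSeconds() is a hypothetical wrapper around the digit-extraction code above, fileStream is an open std::ifstream, and the threshold value is just an example:

// assumes <string>, <fstream> and <cstdint> are included;
// parseTimestampSeconds() wraps the digit extraction shown above
const uint64_t kGapThresholdSecs = 10 * 60; // e.g. flag gaps of over 10 minutes

uint64_t lastTime = 0;
bool haveLastTime = false;

std::string currentString;
while (std::getline(fileStream, currentString))
{
    if (currentString.empty()) // plus whatever timestamp validation is needed
        continue;

    const uint64_t currentTime = parseTimestampSeconds(currentString, timestampStart);

    if (haveLastTime && (currentTime - lastTime) >= kGapThresholdSecs)
    {
        // found a gap between consecutive timestamps: report both lines
    }

    lastTime = currentTime;
    haveLastTime = true;
}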



New Camera Gear

A few weeks ago I splashed out a rather alarming amount of money on a new DSLR camera (Canon EOS 5D Mk IV) and a new lens (Canon EF 24-70mm f/2.8L II USM), both upgrades from the previous versions of each. I certainly didn’t need new versions of either, and to some extent it was one of my fairly silly impulse purchases that I end up regretting after clicking the ‘Purchase’ button, but I’m travelling back to Europe for a few weeks in two months, and I wanted a camera with GPS built in for geo-tagging photos, more megapixels (stitching panoramas of water doesn’t work perfectly), a usable live view, and better low-light performance. It will also likely guilt me into getting back into photography a bit more, which this blog post is also an attempt to do. I did semi-seriously think about jumping to mirrorless with Sony, but I do like Canon gear (even if they’re currently behind technically) and I have several Canon lenses, so it wasn’t an obvious win for me to make the switch.

I’m relatively happy with the new camera and lens: the new II version of the lens is noticeably sharper, although the vignetting falloff/gradient is also much more pronounced than with the old version, and the distortion’s different - although both of those can be corrected in software. The GPS geo-tagging is useful, but unfortunately I’ve found the battery life of my camera is really bad: even with GPS totally off (there are two “Enabled” modes as well as fully “Disabled”), within a day of the camera being turned physically “Off” with a fully-charged battery and sitting in my camera bag, the battery is consistently drained. This happens with multiple Canon batteries (including the one the camera came with), so I’m really not happy about that aspect. I think something must be wrong with my copy electronically, as a colleague has a Mark IV as well without the issue, and most people online clearly don’t seem to have it either. At some point I’ll send it in to Canon to get it looked at and hopefully fixed, but pulling the battery out of the camera works around the long-term storage problem for the moment, and on trips abroad I’ll likely be charging batteries every night, so it’s not the end of the world.

Example Photo taken with new Camera and Lens

The exposure metering of the Mark IV also seems quite different to the Mark III’s: I’m having to dial in one or two stops less exposure to match the Mark III’s levels. I guess the light metering is more accurate or something (although I’d argue photos are getting overexposed with it compared to my Mark III at neutral exposure), but that’s not really a problem if I just keep the exposure compensation dialled down on the camera to match the results I got with the Mark III (which seem more controlled and less over-exposed).

In a further act of semi-madness, I also ordered a Samyang AF 14mm f/2.8 lens, which from reviews looked like one of the cheapest wide-angle primes for astrophotography that still had semi-reasonable performance, as I’m very keen to try to get into astrophotography after several previous failed attempts. I’ll be giving it a go when it arrives, and when the night sky clears up and the wind dies down (I need a heavier tripod!).



