interwebs Archives - Waking up in Geelong
Marcus Wong. Gunzel. Engineering geek. History nerd.

On the radio talking trains with ABC Ballarat – 23 November 2022

It happened to me earlier this year and it’s just happened again – a missed call from a radio producer wondering whether I was free to chat on air the next morning on the topic of trains.

VLocity train in the platform at Ballarat station

This time around it was ABC Radio Ballarat, who had seen my recent posts on the Ballarat line through Bacchus Marsh and curve easing for faster trains, and thought it would be of interest to their listeners.

I said yes, and so I was up early the next morning jabbering on about the history of the Ballarat line.

We’ve seen a lot of changes to the Ballarat train line over the last couple of decades, but it’s only when you piece it all together that you see the sheer scale of the works that have been done.

Marcus Wong is an avid train fan and has been writing about the Ballarat line over the past few weeks on his blog Waking Up in Geelong, unearthing some answers to the strange quirks in how it was built.

You can check me out at the ABC Radio website.

Or listen to it below.

Unfortunately the recording cuts off abruptly at the end, but luckily it’s only the last 30 seconds or so.

Listening to the sound of my own voice – 26 May 2022

There is one peril to being the number one hit on Google for an obscure topic – radio producers looking for a talking head will try and chase you down to get you onto the air.

The story started on May 25, when somebody on Reddit posted a photo titled “I want to go down the forbidden ramp at Southern Cross Station. I’ve got no idea what’s down there, but I’m assuming dragons?” over at /r/Melbourne.

Now I’ve got a lot of photos online showing the old underpass beneath the station, so it wasn’t long before someone shared them to the thread.

Subway under the suburban platforms, looking east from platforms 11 and 12

Setting off the hits on my photo gallery.

My July 2021 piece on the remains of the Spencer Street Station subway also got a run, alongside my follow-up piece Building the Spencer Street Station subway – a history.


Victorian Railways annual report 1961-62

Then the next morning something different – messages via various channels from a producer at ABC Radio Melbourne.

Hey Marcus

Is there a number I can call you on?

Love to chat to you about the Southern cross tunnels…

Anyway, I gave them a ring, and later that day I was on the radio blabbering on about the tunnels at Southern Cross Station.

Which was then followed by a handful of text messages and emails from friends and family who listen to ABC Radio and heard me on air. 😂

You can listen to it at the ABC Radio website, or below.

Footnote

I even managed to slip a bonus piece into the interview – why the Western Ring Road takes a kink around Ardeer.

Getting ‘hugged to death’ by Hacker News – 5 April 2022

The story starts when I published a piece on the backyard approach lighting at Adelaide Airport to my blog.

Later that day I noticed that my website was now running rather sluggishly, so checked the logs – an explosion in traffic.

And the reason – someone over at Hacker News had shared a link to it, and it was getting heaps of traffic.

I’m an occasional visitor to the site, which is a social news website like Reddit, but with a focus on computer science and entrepreneurship – so I was kinda surprised to see it getting a run over there.

Of course, given the tech background of the readers, discussion soon moved on to the ‘hug of death’ that all of the traffic was giving my poor web server.

As well as jokes about the poor state of Australia’s internet.

And fixing it?

I run my websites on a virtual private server (VPS) that I manage myself, so unfortunately for me I was on my own to manage the flood of traffic.

My initial solution was the simplest, but also costly – just scale up my server to one with twice the CPU cores and twice the RAM.

That made my site more responsive, but I didn’t want to double my monthly web hosting costs, so it was time to get smart – and the symptoms described below sounded exactly like my server.

If your VPS gets overloaded and reaches the maximum number of clients it can serve at once, it will serve those, and any other users will simply get a quick failure. They can then reload the page and perhaps have more success on the second try.

This sounds bad, but believe me, it’s much better to have these connections close quickly but leave the server in a healthy state, rather than hang open for an eternity. Surprisingly, you get better performance from a server with fewer child processes that respond quickly than from a server with so many child processes that it can’t handle them all.

I had to dig into the settings of Apache to optimise them for the resources my server had available.

Most operating systems’ default Apache configurations are not well suited for smaller servers – 25 child processes or more is common. If each of your Apache child processes uses 120MB of RAM, then your VPS would need 3GB just for Apache.

One visitor’s web browser may request 4 items from the website at once, so with only 7 or 8 people trying to load a page at the same time your cloud server can become overloaded. This causes the web page to hang in a constantly loading state for what seems like an eternity.

It is often the case that the server will keep these dead Apache processes active, attempting to serve content long after the user gave up, which reduces the number of processes available to serve users and reduces the amount of system RAM available. This causes what is commonly known as a downward spiral that ends in a bad experience for both you and your site’s visitors.

What you should do is figure out how much RAM your application needs, and then figure out how much is left, and allocate most of that to Apache.
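Here’s a rough sketch of that calculation in shell – the process name (apache2, as on Debian and Ubuntu) and the amount of RAM set aside for Apache are assumptions, so treat the output as a starting point rather than gospel:

# RAM you are prepared to hand to Apache, in MB – adjust to suit
RAM_FOR_APACHE_MB=600

# Average resident size of the current Apache children, in KB then MB
avg_kb=$(ps -C apache2 -o rss= | awk '{sum+=$1; n++} END {if (n) print int(sum/n)}')
[ -z "$avg_kb" ] && { echo "no apache2 processes found"; exit 1; }
avg_mb=$(( avg_kb / 1024 ))

echo "Average Apache child size: ${avg_mb} MB"
echo "Suggested MaxRequestWorkers: $(( RAM_FOR_APACHE_MB / avg_mb ))"

Whatever number falls out is what goes against the prefork MPM’s MaxRequestWorkers directive (called MaxClients on older Apache versions).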

I used the handy apache2buddy tool to analyse the RAM usage on my server, and calculate the maximum number of processes Apache should be allowed to spin up.

And since making these changes, the uptime of my websites has skyrocketed.

My uptime status page is powered by the “Cloudflare Worker – Status Page” tool created by Adam Janiš.

Footnote: the ‘Slashdot effect’

Having your website taken down when a popular site links to you has been a thing for years – it’s called the ‘Slashdot effect‘ after one of the early social news websites of the 2000s – Slashdot.

How many blog posts do I write in a year? – 16 July 2015

I’ve just sat down and run the numbers – if I continue at my current blog posting rate, after one year I will have published a total of 142 new entries!

Pile of unopened mX newspapers after the evening peak is over

My current posting schedule is as follows:

  • Two posts a week here (104 posts/year)
  • Once a fortnight about European railways at www.eurogunzel.com (26 posts/year)

As to how I managed to churn out so many blog posts, I don’t actually sit down at the same time every weekend and type out the posts for the next week. Instead, my workflow is as follows:

Step 1:

Add an entry to my ever increasing list of prospective blog post topics. Normally they are just links to newspaper articles, interesting reports in PDF format, or a collection of photos I’m intending to write more about.

Step 2:

Dig through my list of draft entries until I find something that grabs my interest, then start writing and doing further research.

Step 3:

Hit a roadblock and procrastinate. Writer’s block, a dead end on the research front, or a lack of photos are common causes.

Step 4 (optional):

Realise I have bitten off more than I can chew for one blog post, and spin off part of it into a future post. A variant of this is when I find other interesting bits and pieces while researching one subject, resulting in a new entry being added to my list of prospective topics.

Step 5:

Decide the post is finished, and put it into my pending articles pile.

Step 6:

Dig through my pending articles pile, and add them to my list of scheduled posts.

Step 7:

You eventually see the article online.

Footnote

So how long does my workflow take?

My recent ‘Fairness in PTV fare evasion penalties?’ post started as a draft back in December 2014, and required three separate editing sessions to polish up.

My ‘Where does Geelong’s sewage go?’ post was a much bigger job, being almost two years in the making – I started it way back in August 2013, spent some time on it in December 2014, then polished it off in July 2015.

Fixing my blog robot – 24 May 2015

One thing you might not know about this site is that I don’t actually wake up each morning and type up a new blog post – I actually write them ahead of time, and set them up to be published at a future time. Unfortunately this doesn’t always work, such as what happened to me a few weeks ago.

XPT derailed outside Southern Cross - July 11, 2014

I use WordPress to host my various blog sites, and it has a feature called “scheduled posts” – set the time you want the post to go online, and in theory they’ll magically appear in the future, without any manual intervention.

For this magic to happen, WordPress has to regularly check what time it is, check if any posts are due to be published, and if so, publish them – a process that is triggered in two different ways:

  • run the check every time someone visits the site, or
  • run the check based on a cron job (scheduled task)

The first option is unreliable because it delays page load times, and you can’t count on people visiting a low traffic web site, so the second option is what I put in place when setting up my server.

I first encountered troubles with my scheduled posts in early April.

My initial theory was that a recently installed WordPress plugin was to blame, running at the same time as the scheduled post logic and slowing it down.

I removed the plugin, and scheduled posts on this site started to work again – I thought it was all fixed.

However, a few weeks later I discovered that new entries for my Hong Kong blog were missing in action.

I took a look at the config for my cron job, and it seemed to be correct.

*/2 * * * * curl http://example.com/wp-cron.php > /dev/null 2>&1

I hit the URL featured in the command, and it triggered the publication of a new blog post – so everything was good on that front!

I then dug a bit deeper, and ran the curl command directly on my server.

user@server:~$ curl http://example.com/wp-cron.php
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved 
<a href="http://www.example.com/wp-cron.php">here</a>.
</p>
<hr>
<address>Apache Server at example.com Port 80</address>
</body></html>

Bingo – I had found my problem!

Turns out I had previously added a non-www to www redirect for the website in question via a new .htaccess rule – and by default curl doesn’t follow HTTP redirects.

The end result was my cron job hitting a URL, finding a redirect but not following it, resulting in the PHP code never being executed, and my future dated blog posts lying in limbo.

My fix was simple – update my cron job to hit the www. version of the URL – and since then, my future dated blog posts have all appeared on the days they were supposed to.
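For reference, the corrected entry ends up looking like the line below – and adding -L to curl, which tells it to follow redirects, would have papered over the problem just as well:

# Hit the www URL the redirect points at; -L follows any further redirects
*/2 * * * * curl -sL http://www.example.com/wp-cron.php > /dev/null 2>&1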

About the lead photo

The train in the lead photo is the Melbourne-Sydney XPT – on 11 July 2014 it derailed near North Melbourne Station due to a brand new but poorly designed turnout.

Qantas bookings and a ‘4609 error’ – 22 January 2015

The other week I headed up to Sydney, taking the train up one way, and a Qantas flight home. Unlike other airlines in Australia, Qantas still offers a full meal service on many of their domestic flights, so setting my meal preferences was on my todo list.

Qantas A380 VH-OQF

The meal service caught me out last time I flew with Qantas – I forgot to specify a vegetarian meal for my girlfriend, leaving her to chow down on bread rolls and chocolate bars!

This time she made sure that I made a special request, so I headed over to the Qantas website to add it against our booking. I entered her selection, and clicked save, only for this error page to appear:

Error page from Qantas when adding a meal preference to my booking

It read:

Your reservation is confirmed, but your special request could not be processed for these flights. Please contact us for further information. (4609 – 0)

A very useless error message, made all the more useless because they don’t actually tell you how to contact them!

After finding the Qantas call centre phone number on their website, and waiting on hold for 15 minutes – I finally got an answer:

You can’t make a meal request for a flight that doesn’t offer a meal service

At least this time forgetting to specify a vegetarian meal wouldn’t have affected my other half, but it would have been nice if the website had told me the issue upfront, instead of wasting my time on the phone!

Footnote

On the day of the flight I got a call from Qantas – “we have a planeload of international passengers who are going to miss their connection – are you able to move to the 8pm flight?” I said yes, and when we boarded said flight we discovered that there *was* a hot dinner on offer.

Turns out Qantas flights departing between 6pm and 8pm offer a ‘dinner’ service, with refreshments at other times.

Tracing a performance issue on my web server – 5 January 2015

Managing my various web sites can be difficult at times, and my experience the other weekend was no different. My day started normally enough, as I logged onto my VPS and installed the latest security patches, then set to work on uploading new photos to my site. It was then I noticed my web site was taking minutes to load pages, not seconds, so I started to dig into the cause.

Server statistics logged by New Relic

My initial setup

After I moved from shared web hosting, my collection of websites had been running on a $5 / month VPS from Digital Ocean – for that I got 1 CPU, 512 MB of RAM, and 20 GB of disk space. On top of that I used an out-of-the-box Ubuntu image, and installed Apache for the web server and MySQL for the database server.

I then installed a number of separate WordPress instances for my blogs, a few copies of Zenphoto to drive my different photo galleries, and then a mishmash of custom code for a number of other side projects. All of that is exposed via four different domain names, all of which sit behind the CloudFlare CDN to reduce the load on my server.

With so many web sites running on just 512 MB of RAM, performance was an issue! My first fix was to set up a 1 GB swap file to give some breathing room, which did stabilise the situation, but MySQL would still crash every few days when the server ran out of memory.
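For reference, creating a swap file like that is only a handful of commands – this sketch skips the /etc/fstab entry that would make it survive a reboot:

# Create and enable a 1 GB swap file
sudo dd if=/dev/zero of=/swapfile bs=1M count=1024
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile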

Swapping out Apache for the much less memory intensive Nginx web server is one way to fix the issue, but I didn’t have time for that. My solution – cron jobs to check the status of my server and restart the services as required!

The first script I came up with checked if the MySQL service was running, and started it up if it wasn’t.

# Check whether MySQL reports itself as running (Upstart-era Ubuntu output)
service mysql status | grep 'mysql start/running' > /dev/null 2>&1
if [ $? != 0 ]
then
    # Not running: email myself the current status, then start it back up
    SUBJECT="MySQL service restarted $(date)"
    service mysql status | mail -s "$SUBJECT" me@example.com
    sudo service mysql start
fi

My second attempt negated the need for the first script, as it checked to see how much memory was free on my server, and restarted Apache if it was less than a given threshold.

# Minimum free space (MB) before Apache gets restarted
THRESHOLD=300

# Free column of the "Swap:" line from free -m, i.e. unused swap in MB
available=$(free -m | awk '/^Swap:/{print $4}')
if [ $available -lt $THRESHOLD ]
then
    # Swap is nearly exhausted: email the status, then bounce Apache
    SUBJECT="Apache service restarted $(date)"
    service apache2 status | mail -s "$SUBJECT" me@example.com
    sudo service apache2 restart
fi

Under normal load my cron job would restart Apache every day or so, but it did keep the database server up for the rest of the time.

Something is not right

After realising my web site was taking minutes to load pages, not seconds, I started to dig into my server logs. CPU load was hitting 100%, as was memory consumption, and my cron job was restarting Apache every few minutes – something wasn’t quite right!

My first avenue of investigation was Google Analytics – I wanted to find out if the spike in load was due to a flood of new traffic. The Slashdot effect is a nice problem to have, but in my case it wasn’t to be – incoming traffic was normal.

I then took a look at my Apache access logs – they are split up by virtual host, so I had a number of log files to check out. The first suspicious entries I found were brute force attacks on my WordPress login pages – blocking those was simple, but the server load was still high.

Spending my way out

When looking to upgrade a system to handle more traffic, there are two completely different ways to go about it:

  • Be smart and optimise what you already have, to do more with the same resources
  • Throw more resources at the problem, and just ignore the cause

My server was already nearing the 20 GB disk space limitation set by Digital Ocean on their $5 / month VPS, so I figured an upgrade to the next size VPS might fix my problem. Upgrading a Digital Ocean ‘droplet’ is a simple job with their ‘Fast-Resize’ functionality – it takes about a minute – but in my case the option wasn’t available, so I had to do it the hard way:

  1. shut down my server,
  2. create a snapshot of the stopped virtual machine,
  3. spin up a new Digital Ocean server,
  4. restore my snapshot to the new server,
  5. point CloudFlare from my old server IP address to the new one.

All up it took around 30 minutes to migrate from my old server to my new one, but at least with CloudFlare being my public facing DNS host, I didn’t have to wait hours for my new IP address to propagate across the internet!

Unfortunately, the extra resources didn’t fix my problem – CPU load was still through the roof.

Digging for the root cause

I first installed the htop process viewer on my server, and was able to see that MySQL was using far more CPU than normal – presumably my caching wasn’t working right, and my web pages were having to be generated with fresh database queries each time.

Next I fired up a MySQL console, and had a look at the currently running queries. Here I noticed a curious looking query over and over again:

SELECT @serachfield ...

A check of the code deployed to my server indicated that the query was thanks to the search function in Zenphoto, and when I went back into my Apache access logs, I eventually found the problem – a flood of hits on my photo gallery.

Apache web server access logs

Each line in the logs looked like the following:

108.162.250.234 – – [21/Dec/2014:04:32:03 -0500] “GET /page/search/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/beacon-3.newrelic.com HTTP/1.1” 404 2825 “https://railgallery.wongm.com/page/search/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/maintenance/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/nr-476.min.js” “Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648)”

Each request was bound for “http://js-agent.newrelic.com/nr-476.min.js” or other files hosted at newrelic.com, and the user agent always appeared to be Internet Explorer 8.
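A quick way to gauge how much of the traffic matched this pattern is to count the matching lines in the access log – the log file name here is a guess at what that virtual host writes to:

# Count access log entries that reference the New Relic agent script
grep -c 'js-agent.newrelic.com' /var/log/apache2/railgallery-access.log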

New Relic is a software analytics tool I have installed on my server, and on seeing the multiple references to it in my access logs, I remembered that I had updated my version of the New Relic agent just before my performance issues had started. Had I found a bug in it?

The cause

A check of the HTML source of the page in question showed a link to js-agent.newrelic.com embedded in the page, so I came up with the following explanation for the load on my server:

  1. A user hits https://railgallery.wongm.com/page/search/SEARCH_TERM
  2. The New Relic Javascript file at http://js-agent.newrelic.com/nr-476.min.js somehow gets loaded as a relative path, and not an absolute one, which results in a request to:
    https://railgallery.wongm.com/page/search/SEARCH_TERM/js-agent.newrelic.com/nr-476.min.js
  3. My server would then treat the above URL as valid, delivering a page, which then includes a relative link to js-agent.newrelic.com/nr-476.min.js a second time, which then results in a page request to this URL:
    https://railgallery.wongm.com/page/search/SEARCH_TERM/js-agent.newrelic.com/js-agent.newrelic.com/nr-476.min.js
  4. And so on recursively:
    https://railgallery.wongm.com/page/search/SEARCH_TERM/js-agent.newrelic.com/js-agent.newrelic.com/js-agent.newrelic.com/nr-476.min.js

With the loop of recursive page calls for a new set of search results, each requiring a fresh database query, it was no wonder my database server was being hit so hard.

As an interim fix, I modified the Zenphoto code to ignore search terms that referenced New Relic, and then rolled back to the older version of the New Relic agent.

# Remove the current New Relic agent packages, then pin the previous release
sudo apt-get remove newrelic-php5
sudo apt-get remove newrelic-php5-common
sudo apt-get remove newrelic-daemon
sudo apt-get autoremove newrelic-php5
sudo apt-get install newrelic-php5-common=4.15.0.74
sudo apt-get install newrelic-daemon=4.15.0.74
sudo apt-get install newrelic-php5=4.15.0.74

I then raised a support case for New Relic to look into my issue. In an attempt to reproduce the issue, I rolled forward with the current version of the New Relic agent to play ‘spot the difference’, but I couldn’t find any, and the errors also stayed away.

I’m writing this one off as a weird conflict between the updated New Relic agent running on my server, and an old version of the browser monitoring javascript file cached by a single remote user.

Conclusion

After working through my performance issues I now know more about what my web server is doing, and the extra RAM available following the upgrade means my horrible cron job hacks are no longer required to keep the lights on!

As for the steps I will follow next time, here are the places to check:

  • Google Analytics to check if I am getting a flood of legitimate traffic,
  • Apache access logs for any odd looking sources of traffic,
  • current process list to see where the CPU usage is coming from,
  • currently running MySQL queries for any recurring patterns.
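As a rough cheat sheet for the server-side items on that list (the log path is an assumption for a Debian-style Apache install):

sudo htop                                       # live per-process CPU and memory usage
sudo tail -n 200 /var/log/apache2/access.log    # recent requests – anything odd?
mysql -u root -p -e "SHOW FULL PROCESSLIST;"    # queries running right now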

Rebuilding all of my websites – 9 July 2014

I’ve been quite busy recently – on Thursday last week I discovered all of my web sites were offline, which resulted in me moving to a new hosting provider, and rebuilding every bit of content. So how did I do it?

Going offline

I first realised something was wrong when I discovered all of my web sites displaying the following ominous error message:

 'Website Suspended' message from cPanel

I checked my email, and I couldn’t find any notification from my hosting provider that my account was suspended – a pretty shit job from them!

However, I wasn’t exactly surprised, as over the past few years I’ve been receiving these automated emails from their system:

Your hosting account with username: [XYZ] has over the last few days averaged CPU usage that is in excess of your account allocation.

This could be caused by a number of factors, but is most likely to be due to a misconfigured installation of a 3rd party script, or by having too many features, modules or plugins enabled on your web site.

If you simply have a very busy or popular web site, you may need to upgrade your account which will give you a higher CPU allocation. Please contact our support team if you need help with this.

Until your usage average drops back below your CPU quota, your account will be throttled by our CPU monitoring software. If your account continues to use more CPU than what it is entitled to, you risk having your account suspended.

All up I was running about a dozen different web sites from my single shared web hosting account, and over the years I’ve had to increase the amount of resources available to my account to deal with the increasing load.

Eventually I ended up on a ‘5 site’ package from my hosting provider, which they were charging me almost $300 a year for – a steep price, but I was too lazy to move everything to a new web host, so I just kept on paying it.

Having all of my sites go offline was enough of a push for me to move somewhere new!

What needed to be moved

All up my online presence consisted of a dozen different sites spread across a handful of domain names, running a mix of open source code and code I had written myself. With my original web host inaccessible, I had to rebuild everything from backups.

You do have backups don’t you?

Welcome to the western suburbs

The rebuild

I had been intending to move my web sites to a virtual private server (VPS) for a while, and having to rebuild everything from scratch was the perfect excuse to do so.

I ended up deciding to go with Digital Ocean – they offer low-ish prices, servers in a number of different locations around the world, fast provisioning of new accounts, and an easy migration path to a faster server if you ever need it.

After signing up to their bottom end VPS (512 MB RAM and a single core) I was able to get cracking on the rebuild – they emailed me the root password a minute later and I was in!

As I had a bare server with nothing installed, a lot of housekeeping needed to be done before I could start restoring my sites:

  • Swapping over the DNS records for my domains to my new host,
  • Locking down access to the server,
  • Setting up a swap file,
  • Installing Apache, MySQL and PHP on the server,
  • Creating virtual directories on the server for each separate web site,
  • Creating user accounts and empty databases in MySQL
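Below is a minimal sketch of a few of those steps on an Ubuntu box of the era – the package names, paths, domain and password are placeholders rather than my actual configuration:

# Install the basic LAMP stack
sudo apt-get update
sudo apt-get install apache2 mysql-server php5 php5-mysql

# One directory and virtual host per site
# (assumes a matching file already exists in /etc/apache2/sites-available/)
sudo mkdir -p /var/www/example.com
sudo a2ensite example.com.conf
sudo service apache2 reload

# Empty database and user, ready for a WordPress restore
mysql -u root -p -e "CREATE DATABASE blog; GRANT ALL ON blog.* TO 'blog'@'localhost' IDENTIFIED BY 'changeme';"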

I’ve only ever played around with Linux a little, but after 30 minutes I had an empty page appearing for each of my domain names.

To get my content back online, thankfully I had the following backups available to me:

  • I run three blogs on the open source WordPress software, so I could just install that from scratch to get a bare website back
  • My main photo gallery runs on the open source ZenPhoto software, so that was another internet download
  • Each blog and photo gallery uses a custom theme, of which I had backups on my local machine to re-upload
  • I keep a mirror of my WordPress uploads on my local machine, so I just had to reupload those to make the images work again
  • When I upload new photos to my gallery, I keep a copy of the web resolution version on my local machine, which I was able to reupload
  • Every night I have a cron job automatically emailing a backup copy of my WordPress and ZenPhoto databases to me, so my blog posts and photo captions were safe
  • Some of my custom web code is available on GitHub, so a simple git pull got those sites back online

Unfortunately I ran into a few issues when restoring my backups (doesn’t everyone…):

  • My WordPress backup was from the day before, and somebody had posted a new comment that day, so it was lost
  • I had last mirrored my WordPress uploads about a week before the crash, so I was missing a handful of images
  • The last few months of database backups for Rail Geelong were only 1kb in size – it appears the MySQL backup job on my old web host was defective
  • Of the 32,000 photos I once had online, around 2,000 files were missing from the mirror I maintained on my local machine, and the rest of them were in a folder hierarchy that didn’t match that of the database

I wasn’t able to recover the lost comment, but I was able to chase up the missing WordPress uploads from other sources, and thankfully in the case of Rail Geelong my lack of regular updates meant that I only lost a few typographical corrections.

As for the 2,000 missing web resolution images, I still had the original high resolution images available on my computer, so my solution was incredibly convoluted:

  • Move all of the images from the mirror into a single folder
  • Use SQL to generate a batch file to create the required folder structure
  • Use more SQL to generate a second batch file, this time to move images into the correct place in the folder structure
  • Run a diff between the images that exist, and those that do not
  • Track down the 2,000 missing images in my collection of high resolution images, and create a web resolution version in the required location
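As a sketch of the SQL-to-batch-file trick – the database, table and column names below are made up for illustration, and Zenphoto’s real schema differs – the idea is to have the database emit the shell commands itself:

# Hypothetical example only: generate 'mkdir -p' and 'mv' commands from the gallery database
mysql -N zenphoto -e "SELECT CONCAT('mkdir -p albums/', folder) FROM albums;" > make_folders.sh
mysql -N zenphoto -e "SELECT CONCAT('mv flat/', filename, ' albums/', folder, '/', filename) FROM images;" > move_images.sh
bash make_folders.sh && bash move_images.sh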

Three hours after I started, I had my first win.

Unfortunately I found a number of niggling issues throughout the night.

By 2am I was seven hours in, and had managed to get another domain back online.

Eventually I called it quits at 4am, as I waited for my lethargic ADSL connection to push an elephant up a drinking straw.

I spent the weekend out and about so didn’t get much time to work on rebuilding my content – it wasn’t until the fourth day after my sites went down that I started to track down the 2,000 missing images from my photo gallery.

Thankfully I got a lucky break – on Monday afternoon I somehow regained access to my old web host, so I was able to download all of my missing images, as well as export an up-to-date version of the Rail Geelong database.

After a lot more stuffing around with file permissions and monitoring of memory usage, by Tuesday night it seemed that I had finally rebuilt everything, and it was all running somewhat reliably!

What’s next

Plenty of people online seem to rave about replacing the Apache web server and standard PHP stack with Nginx and PHP-FPM to increase performance – it’s something I’ll have to try out when I get the time. However for the moment, at least I am back online!

News Limited and the ‘sslcam’ redirect – 7 July 2014

Recently I was in the middle of researching a blog post, when my internet connection crapped out, leaving me at an odd looking URL. The middle bit of it made sense – www.theaustralian.com.au – but what is up with the sslcam.news.com.au domain name?

https://sslcam.news.com.au/cam/authorise?channel=pc&url=http%3a%2f%2fwww.theaustralian.com.au%2fbusiness%2flatest%2fsmartphone-app-to-track-public-transport-woes%2fstory-e6frg90f-1226863043667

I then started researching the odd looking domain name, with the only thing of note being somebody else complaining about it.

I then went back to the original link I clicked on, and followed the chain of network activity that followed.

News Limited using 'sslcam' domain to track users

First hit – the shortened link I found on Twitter:

http://t.co/wLP4Lj9kXP

Which redirected to the article on the website of The Australian:

http://www.theaustralian.com.au/business/latest/smartphone-app-to-track-public-transport-woes/story-e6frg90f-1226863043667

Which then redirected me to a page to check for cookies – presumably part of their paywall system:

http://www.theaustralian.com.au/remote/check_cookie.html?url=http%3a%2f%2fwww.theaustralian.com.au%2fbusiness%2flatest%2fsmartphone-app-to-track-public-transport-woes%2fstory-e6frg90f-1226863043667

It then sent me back to the original article:

http://www.theaustralian.com.au/business/latest/smartphone-app-to-track-public-transport-woes/story-e6frg90f-1226863043667

Which then bounced me to the mysterious sslcam.news.com.au domain:

https://sslcam.news.com.au/cam/authorise?channel=pc&url=http%3a%2f%2fwww.theaustralian.com.au%2fbusiness%2flatest%2fsmartphone-app-to-track-public-transport-woes%2fstory-e6frg90f-1226863043667

And third request lucky – the original article:

http://www.theaustralian.com.au/business/latest/smartphone-app-to-track-public-transport-woes/story-e6frg90f-1226863043667

Quite the chain of page redirects!
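To watch a redirect chain like this from the command line, curl can follow it for you – -I asks for headers only, -L follows each redirect, and grep keeps just the status lines and Location headers. Some of the hops rely on cookies being set, so this won’t always reproduce the full chain, but it shows the idea:

# Print each hop's status line and redirect target
curl -sIL http://t.co/wLP4Lj9kXP | grep -iE '^(HTTP|Location)'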

The sslcam.news.com.au domain

Internet services company Netcraft have collated the following information:

Date first seen: January 2012
Organisation: News Limited
Netblock Owner: Akamai International, BV
Nameserver: dns0.news.com.au
Reverse DNS: a23-51-195-181.deploy.static.akamaitechnologies.com

Akamai Technologies is a company that runs a content delivery network used by many media companies – their systems make websites faster to load by saving a copy of frequently viewed content to servers located closer to the end users.

As for the reason for the cascade of page redirects and the mysterious sslcam.news.com.au domain, I’m at a loss to explain it – sorry!

Footnote

The sslcam.news.com.au domain is also used by other News Limited websites – the Herald Sun also routes traffic to their website via it.

Wikipedia and railfan rumours – 18 June 2014

There is an old railway saying that goes “If you haven’t heard a rumour by lunchtime, then start a new one”. This leads to all kinds of harebrained discussion threads wherever railfans congregate, as well as Wikipedia articles such as this snippet I found the other day:

Tottenham Yard

Tottenham Yard was opened in the western suburb of Tottenham from the 1920s as part of a project to improve freight movement in Victoria. The majority of freight traffic in the state was from the north or western areas, and was being remarshalled into trains at Melbourne Yard. This caused inefficiencies with the large number of trains needing to enter the Melbourne city, so the yard was opened for the marshalling of trains before they were sent to Melbourne Yard.

Laid with broad gauge trackage, Tottenham is a gravitational yard with a slight slope from the Sunshine end towards the city. The yard consists of four groups of sidings: arrival roads, two groups of classification roads, and departure tracks. Heavy usage of the yard ended with the gauge conversion of the main line to Adelaide in 1995, and with the decline of broad gauge traffic in general, large areas of the yard are now used for wagon storage. Tottenham station is located to the south of the yard.

The part conversion of Tottenham Yard to standard gauge is expected to commence next year which will allow larger Standard Gauge freight trains to terminate at Tottenham with trip working from the yard to Melbourne and return.

What caught my eye was the “expected to commence next year” line in the final paragraph, which lacked any mention of the date when the statement was originally written. So when was this “partial conversion to standard gauge” supposed to have started?

Thankfully Wikipedia makes the full edit history of each and every article available, which makes tracking down the source of the statement just a few clicks away – 19 July 2011!

Wikipedia edit history
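If a few clicks ever turns into a bigger trawl, the same history is available programmatically through the MediaWiki API – the article title below is a placeholder:

# List the five most recent revisions of a Wikipedia article
curl -s 'https://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvlimit=5&rvprop=timestamp|user|comment&format=json&titles=ARTICLE_TITLE'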

Three years on, and nothing has happened on the partial conversion of Tottenham Yard to standard gauge front – yet another railfan rumour that came to nothing!

XR552 and one half of the Kensington grain at Tottenham Yard, ready to meet up with the other half ex-Kensington
