search engine optimisation Archives - Waking up in Geelong
https://wongm.com/tag/search-engine-optimisation/
Marcus Wong. Gunzel. Engineering geek. History nerd.

Yet another Google Image Search OCR adventure
https://wongm.com/2024/05/yet-another-google-image-search-ocr-adventure/
Mon, 13 May 2024 21:30:00 +0000

The post Yet another Google Image Search OCR adventure appeared first on Waking up in Geelong.

A few years ago I discovered that Google Image Search applies OCR to indexed images, enabling it to return results for text that has never appeared online, and I’ve found more examples over the years since. Well, now I’ve found yet another!

Who’s that bus?

Back in 2023 I photographed a Transit Systems coach, with registration plate 5629AO.

Transit Systems coach #200 5629AO on Geelong Road, Brooklyn

I wanted to see if I’d photographed this coach before, so I plugged ‘5629AO’ into Google Search – which didn’t return anything of mine.

But it did return a photo dated 2015 on Flickr titled ‘Big White Bus‘, with registration plate 5629AO visible – the same bus I had photographed.

And an article in the Bendigo Advertiser dated 2014 titled ‘Bus driver trapped by live power lines after incident‘, illustrated with a photo of the bus involved – registration 5629AO.

As for the history of the bus itself, the Australian Bus Fleet Lists entry for 5629AO says it entered service in 2005.

And I didn’t think that would even work!

I ended up in a discussion recently around Australian bus manufacturing, where imported chassis have bodies built atop them locally, and was trying to find out which bus operators use Iveco chassis – so I figured I’d try searching my photo gallery to see if I could find any.

And lo and behold, a search for “wongm Iveco” actually turned up highly relevant entries.

From the first row – a photo captioned “Moonee Valley Coaches #92 2266AO on St Albans Road, Sunshine North” appears because the bus has an ‘Iveco’ badge on the front.

Moonee Valley Coaches #92 2266AO on St Albans Road, Sunshine North

The same applies to the photo captioned “HG Corporate Buses minibus XV95BF with luggage trailer at William Street and Flinders Lane”.

HG Corporate Buses minibus XV95BF with luggage trailer at William Street and Flinders Lane

“Hi-rail weed spray trucks at Somerton” too.

Hi-rail weed spray trucks at Somerton

“Moonee Valley Coaches #87 BS02AL at Swanston and Flinders Street”.

Moonee Valley Coaches #87 BS02AL at Swanston and Flinders Street

“Moonee Valley Coaches bus #82 9682AO on out of service at Moonee Ponds Junction”.

Moonee Valley Coaches bus #82 9682AO on out of service at Moonee Ponds Junction

“Moonee Valley Coaches bus #90 2264AO on route 506 along Glenlyon Road at Lygon Street”.

Moonee Valley Coaches bus #90 2264AO on route 506 along Glenlyon Road at Lygon Street

And “McKenzie’s coach 1829AO at the Southern Cross coach terminal”.

McKenzie's coach 1829AO at the Southern Cross coach terminal

I wouldn’t have thought that a piece of text as small as the badge on the front of a bus would get picked up by Google, but it turns out they can do it.

Another OCR adventure with Google Image Search
https://wongm.com/2021/09/another-ocr-adventure-with-google-image-search-q-train-queenscliff/
Mon, 06 Sep 2021 21:30:00 +0000

The post Another OCR adventure with Google Image Search appeared first on Waking up in Geelong.

A few years ago I discovered that Google Image Search applies OCR to indexed images, enabling it to return results for text that has never appeared online. Well, now I’ve found another example.

Some detective work

The story started two months ago, when Victorian residents were allowed to go further than 5 kilometres from home and dine at restaurants – I headed down to the Bellarine Peninsula to see The Q Train, where South African Railways class 24 steam engine 3620 was ready to haul the lunch session out of Queenscliff station.

South African Railways class 24 steam engine 3620 ready to lead The Q Train out of Queenscliff

Around the corner, I found this open-air carriage parked in the sidings – the only identifier on the side being ‘OXK 34689’.

Observation carriage OXK 34689 for 'The Q Train' at Queenscliff

That wasn’t a wagon code I’d heard of before, so I put it into a Google search – and got no web pages of note.

But the image search results turned up something interesting – a photo of the same carriage that I had photographed.

I clicked through the link, and ended up on the Facebook page for The Q Train.

And if you take a closer look at the photo – Google had indexed the ‘OXK 34689’ code on the side of the carriage.

But what about the carriage?

I ended up getting an answer the old fashioned way – asking a bunch of rolling stock nerds on Facebook whether they knew the back story of the carriage. The first lead:

It came from Cairns along with the South African steam locomotive.

That service ran for four months in 2004 between Cairns and Kuranda in Queensland, before lower than expected patronage numbers saw it withdrawn. The rolling stock then lay idle until 2020, when it found a new home at Queenscliff.

As for the identity of the wagon, I got a second lead.

Queensland Rail steel H class (HJS/HWA) open wagon is my guess.

That led me to a webpage on the ‘HSA / HWA Wagons’, created by someone modelling the Queensland Railways in H0n3½ scale.

Between 1965 and 1981, 550 general traffic open goods/freight wagons entered service on the QR Network. The wagons were much the same size as the previous HJS wagons that entered service in the early 1950’s.

And they broke down the wagon numbers.

Contract # 1. HSA Class # 33119 – 33268, built by Scott’s of Ipswich in 1965.

Contract # 2. HSAT Class # 34667 – 34799, built by Scott’s of Ipswich in 1971

Contract # 3. HWA Class # 35820 – 35969, built by Scott’s of Ipswich in 1972.

OXK 34689 happens to fall into the HSAT 34667 – 34799 number group, and the underframe looks much the same as the observation carriage at Queenscliff.

HSAT 34753 Rockhampton 5/1987
Norm Bray photo

So my mystery carriage OXK 34689 was once Queensland Rail open wagon HSAT 34689, built in 1971, acquired by Cairns Kuranda Steam in the early 2000s and converted into an observation carriage, placed into storage in 2004, and finally transported to Victoria in 2020.
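The number-range lookup I did by eye can be sketched in a few lines of Python. The three ranges and build years are copied from the contract list quoted above; the function name `classify_wagon` is just something made up for this illustration, and the table only covers the three contracts quoted here.

```python
# Number ranges and build years as quoted from the 'HSA / HWA Wagons' page.
# Only the three contracts listed in this post are included.
CONTRACTS = [
    ("HSA", range(33119, 33268 + 1), 1965),
    ("HSAT", range(34667, 34799 + 1), 1971),
    ("HWA", range(35820, 35969 + 1), 1972),
]


def classify_wagon(number):
    """Return (class, build year) for a wagon number, or None if it falls outside the quoted ranges."""
    for wagon_class, numbers, year in CONTRACTS:
        if number in numbers:
            return wagon_class, year
    return None


print(classify_wagon(34689))  # ('HSAT', 1971)
```

A number outside the three quoted ranges returns `None` rather than a guess.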

And a plug for The Q Train

I went on The Q Train a few years ago, and it’s well worth the trip!

Dining car at one end of The Q Train

Even if you don’t hit the dance floor in the ‘Club Loco’ carriage.

'Club Loco' onboard The Q Train

Once lockdown is over, go treat yourself!

Google Image Search applying OCR to indexed images
https://wongm.com/2018/08/google-image-search-applying-ocr-indexed-images/
Mon, 06 Aug 2018 21:30:00 +0000

The post Google Image Search applying OCR to indexed images appeared first on Waking up in Geelong.

While doing some online research I found evidence of some new functionality in Google Image Search – when crawling the web, Google is applying OCR (Optical Character Recognition) to the images that it finds, and using this data in its search index.

I was writing a post about the use of antimacassars onboard V/Line trains, so started researching the Australian supplier of the seat headrest covers.

BTN263 looking to the east end

My search term in Google was ‘merino headrex’, which only brought one relevant search result: a copyright application for the ‘Headrex’ name by Encore Tissue (Aust) Pty Ltd, owner of the ‘Merino’ brand.

Bing Search also delivered similar results for the same search query.

But when I flicked over to Google Image Search, something new appeared.

A photo of mine – the one that led me to search for ‘merino headrex’ in the first place.

But the spooky part – I had never put the words ‘merino’ or ‘headrex’ anywhere on my website.

So the most likely explanation – Google is applying OCR to the images that it finds, then adding the data to its search index.
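To make that explanation concrete, here is a toy Python sketch of how text recognised inside images could end up searchable. This is only a guess at the general shape, not Google's actual pipeline. It assumes the OCR step has already run and fakes its output in the hypothetical `ocr_text` dictionary; the example.com URLs and the function names are placeholders.

```python
import re
from collections import defaultdict

# Hypothetical OCR output: image URL -> text recognised inside the image.
# A real pipeline would get this from an OCR engine; here it is typed in by hand.
ocr_text = {
    "https://example.com/seat-cover.jpg": "Merino Headrex",
    "https://example.com/bus.jpg": "IVECO 2266AO",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(texts):
    """Build an inverted index: word -> set of image URLs whose OCR text contains it."""
    index = defaultdict(set)
    for url, text in texts.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the images whose OCR text contains every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    return set.intersection(*(index.get(word, set()) for word in words))


index = build_index(ocr_text)
print(search(index, "merino headrex"))  # {'https://example.com/seat-cover.jpg'}
```

The point of the sketch is that the page text never needs to contain the words: once the recognised text is in the index, a query for ‘merino headrex’ finds the image on its own.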

More on Google and OCR

Over the years a number of Search Engine Optimization (SEO) blogs have speculated around Google’s search indexing capabilities.

From TechCrunch in January 2008:

A patent application lodged by Google in July 2007 but recently made public seeks to patent a method where by robots (computers) can read and understand text in images and video.

The extension of the application would be that images and video indexed by Google would be searchable by the text located within the image or video itself, a big step forward in indexing that has not previously been available.

Information Week suggests that privacy issues raised by Google Maps Street View will get more complicated as eventually YouTube videos will be indexable via the text that appears within them.

From ‘SEO by the Sea’ in November 2015:

I had some hope over the years that Google might get better at indexing text that appeared within links, watching some things like the following happen:

(1) Google acquired Facial and object recognition company Nevenvision in 2006, and a few other companies that can recognize images.

(2) In 2007, Google was granted a patent that used OCR (Optical Character Recognition) to check upon the postal addresses on business listings, to verify those businesses in Google Maps.

(3) Google was granted a similar patent in 2012 that read signs in buildings in Street Views images.

(4) In 2011, Google published a patent application that used a range of recognition features (object, facial, barcodes, landmarks, text, products, named entities) focusing upon searching for and understanding visual queries, which looks like it may have turned into the application for Google Goggles, which came out in September of 2010 – the visual queries patent was filed by Google in August, 2010, the nearness in time with the filing of the patent and the introduction of Google Goggles reinforces the idea that they are related.

But, Googlebot still doesn’t seem to be able to read text in images for purposes of indexing addresses, or to read images of text used in navigation. I added the text “Google Test” to the following image, and then ran it through a reverse image search at Google. The images returned were similar looking, but none of them had anything to do with the text I added to the image.

And ‘Search Engine Roundtable’ in March 2016:

A question was posed to Google’s Gary Illyes on Twitter if Google’s crawler and indexer understands the text embedded in an image, maybe through OCR or other techniques. I am surprised to hear Gary say no.

A year is a long time on the internet.

Footnote on Google Image Search for obscure topics

Take a look at the other results from Google Image Search, and spot the odd one out.

My photo of the seat covers, and the Merino sheep make sense. But these three photos…

Railway tracks on a wharf.

Lead from Melbourne Yard arrives at wharves 1-4

People in hi-vis vests standing around a pile of wood.

Pile of wooden packing pieces used to get the derailed 8114 back onto the rails

And a train covered in a tarpaulin.

Sprinter 7012 still covered with a tarpaulin at the Dudley Street sidings

They have nothing at all to do with a merino sheep, but they do have one thing in common – they are hosted on the same domain as my ‘merino headrex’ image.

Thanks to the lack of any other relevant results, Google’s algorithms decided that proximity to a relevant image is enough of a ranking signal to push it up the search result pages.
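Here is a toy Python sketch of what that kind of fallback might look like. It is pure speculation about the behaviour, not a description of Google's ranking: the image records, the `rank` function and the `neighbour_score` weight of 0.1 are all assumptions invented for this example.

```python
from urllib.parse import urlparse

# Hypothetical image records: URL -> words associated with the image.
images = {
    "https://example.com/seat-cover.jpg": {"merino", "headrex"},
    "https://example.com/wharf-tracks.jpg": {"railway", "wharf"},
    "https://example.org/sheep.jpg": {"merino", "sheep"},
    "https://example.net/tram.jpg": {"tram"},
}


def rank(images, query_words, neighbour_score=0.1):
    """Rank direct matches by words matched, then add same-domain neighbours with a small score."""
    # Images that match at least one query word, scored by number of words matched.
    direct = {
        url: len(words & query_words)
        for url, words in images.items()
        if words & query_words
    }
    # Domains hosting at least one directly matching image.
    matching_domains = {urlparse(url).netloc for url in direct}
    scores = dict(direct)
    for url in images:
        if url not in direct and urlparse(url).netloc in matching_domains:
            scores[url] = neighbour_score  # made-up weight for "proximity"
    # Highest score first; ties broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


print(rank(images, {"merino", "headrex"}))
```

With these made-up records the seat cover photo ranks first, the sheep second, and the unrelated wharf photo sneaks in last purely because it shares a domain with a matching image. The tram photo, on a domain with no matches, is left out.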

I’ve confused Google’s algorithms in this way before, with my Hong Kong themed blog at www.checkerboardhill.com/.

I think I've misled Google?

I searched Google for “Sheung Shui slaughterhouse” but was given my own photo of an Australian diesel locomotive!
