GeoRabble Perth #6 – Big Ideas for Big Data

Leaving yesteryear to the yesteryears, this byte of GeoRabble was synced with the worldwide event that was Big Data Week – a “global community and festival of data”. Yes, the puns were international(ly bad)!


Ian McCleod giving us a tour on his yellow submarine

The MC of the night, Nicholas Flett, kicked off the sold-out event by introducing us to Ian McCleod, who led us on a colourful subsea tour of the hows and whats of data collection from wreckages, artefacts and other curiosities resting on the seabed, including what to do when you encounter a Japanese tank at 38 m below the surface when you’re only licensed to dive to 30 m (you go the extra 8 m to capture that data!). He emphasised the importance of keeping pace with ever-evolving technology to capture data – and that, despite the challenges it might present, data capture is always worth the effort.

Catapulting us from an underwater world to the twittersphere, Nicholas introduced the next speaker, Tim Highfield, who gave us an insight into how the thoughts and voices of people around the world become Big Data through Twitter. Using a combination of open-source and in-house tools and methods, Tim described how Twitter data is captured and some of the analyses that can be done by examining hashtags. GeoRabblers were walked through fascinating examples, including Australian politics (e.g. #auspol, #ausvotes), the Queensland Floods, the Arab Spring, Eurovision, the Tour de France and Occupy Oakland. It would seem that the very nature of the Big Data coming out of Twitter lends itself to a plethora of analytical dimensions, limited only by the creativity of the researcher. Tim left the audience with a sobering question of how to determine whether we have too much data or not enough, and a poignant statement – that while media represents the first draft of history, Twitter is the first draft of the present. Tim’s slides and presentation are available online here.
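For anyone wondering what "examining hashtags" might look like at its very simplest, here is a minimal sketch in Python. To be clear, this is not the tooling Tim described: the function name `count_hashtags` is ours and the sample tweets are invented purely for illustration. It only shows the basic idea of pulling hashtags out of tweet text and tallying them.

```python
import re
from collections import Counter

# A hashtag here is simply '#' followed by letters, digits or underscores.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def count_hashtags(tweets):
    """Count how often each hashtag appears, ignoring upper/lower case."""
    counts = Counter()
    for text in tweets:
        for tag in HASHTAG_PATTERN.findall(text):
            counts[tag.lower()] += 1
    return counts


# Invented sample tweets, for illustration only (not real data).
sample_tweets = [
    "Question time is getting heated #auspol",
    "Polling booths open early tomorrow #ausvotes #auspol",
    "Counting continues overnight #AusVotes",
]

hashtag_counts = count_hashtags(sample_tweets)
for tag, n in hashtag_counts.most_common():
    print(f"#{tag}: {n}")
```

Real analyses would of course go much further (time series, retweet networks, locations), but the tallying step above is a common starting point.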

Next up, blasting from the twittersphere to the outer reaches of space and time, we were introduced to Kevin Vinsen, who gave us a thrilling insight into the Square Kilometre Array (SKA) project. After a short and sweet introduction to space, light-years, the Big Bang and telescopes, GeoRabblers were presented with the enormity of the project data itself: enough raw data to fill 15 million 64 GB iPods every day! That’s almost 1 exabyte a day! The processing power required is estimated at 100 petaflops, about 50 times that of the most powerful computer in 2010 and roughly equivalent to the processing power of 100 million PCs. If that’s not Big Data, I’m not sure what is! Based on development trends, a supercomputer with the processing power required is expected to exist by 2018. The aim of the radio telescope is to address fundamental questions about the Universe (how did stars and galaxies form after the Big Bang? Is there life beyond Earth?). To help out with processing all this data, Kevin made mention of SkyNet (vaguely ominous name!), an initiative to pool together the processing power of personal computers connected to the internet to mimic the abilities of a supercomputer. What a wonderful opportunity for everyday citizens to contribute to a scientific endeavour of this calibre!
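For the curious, the arithmetic behind those figures can be checked in a few lines of Python. This is a back-of-envelope sketch only. It assumes decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes) and reads the "100 million PCs" comparison as applying to the 100 petaflops figure; the variable names are ours, and the inputs are simply the numbers quoted on the night.

```python
# Back-of-envelope check of the SKA figures quoted in the talk.
# Assumption: decimal (SI) units, i.e. 1 GB = 10**9 bytes, 1 EB = 10**18 bytes.
GB = 10**9
EB = 10**18
GIGA = 10**9
PETA = 10**15

# Data volume: 15 million iPods of 64 GB each, every day.
ipods_per_day = 15_000_000
ipod_capacity_bytes = 64 * GB
raw_bytes_per_day = ipods_per_day * ipod_capacity_bytes
raw_exabytes_per_day = raw_bytes_per_day / EB

# Processing power: 100 petaflops (floating-point operations per second).
required_flops = 100 * PETA
# "About 50 times" the fastest 2010 machine implies roughly this speed:
implied_2010_top_flops = required_flops / 50
# "100 million PCs" implies roughly this speed per PC:
implied_flops_per_pc = required_flops / 100_000_000

print(f"Raw data per day: {raw_exabytes_per_day:.2f} EB")
print(f"Implied fastest 2010 machine: {implied_2010_top_flops / PETA:.0f} petaflops")
print(f"Implied speed per PC: {implied_flops_per_pc / GIGA:.0f} gigaflops")
```

So 15 million 64 GB iPods works out to 0.96 EB, which squares with "almost 1 exabyte a day".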

Onwards with the next speaker, Bryan Boruff, who delved into the semantic nature of this new buzzword that is ‘Big Data’, addressing the elephantine question of “What exactly is ‘Big Data’?”. Bryan suggested that Big Data is data that is beyond the conventional or current methods of storing and handling. He went on to describe the dimensions in which Big Data increasingly presents itself, the four ‘V’s: volume, variety, velocity and veracity. So how does one manage this situation? Bryan presented two paradigms: the familiar sequential ‘capture, store, analyse’ method of data handling, and that of ‘automated epistemologies’, where data streams are analysed on the fly and not stored. But the latter presents a troublesome conundrum by contravening a basic scientific principle – that analyses and experiments must be able to be replicated, and therefore the data must be available. Tricky situation – will technology and people keep pace with data? Or will the scientific method be challenged?
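To make the ‘analysed on the fly and not stored’ idea a little more concrete, here is a minimal sketch in Python. It is our own illustration rather than anything Bryan presented: the class `RunningStats`, the generator `sensor_stream` and the readings are all hypothetical. It uses Welford's online algorithm to keep a running mean and variance while looking at each value exactly once.

```python
class RunningStats:
    """Running count, mean and sample variance (Welford's online algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        """Fold one new value into the statistics without storing it."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        """Sample variance; not defined for fewer than two values."""
        if self.count < 2:
            return float("nan")
        return self._m2 / (self.count - 1)


def sensor_stream():
    """Hypothetical data stream that yields invented readings one at a time."""
    for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield reading


stats = RunningStats()
for reading in sensor_stream():
    stats.update(reading)  # each reading is used once, then discarded

print(stats.count, stats.mean, stats.variance)
```

Note what is left at the end: three numbers and no raw data. Nobody can re-run the analysis a different way, which is exactly the replication worry raised above.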

Wrapping up the presentations of the evening, Paul Farrell introduced GeoRabblers to “Big D”, the cool way to refer to Big Data, and kicked off with the interesting factoid that Big Data is apparently now the number one buzzword since ‘Y2K’, back when everyone thought the world would grind to a halt when the calendar ticked over to 2000. Paul went on to explain that the word ‘data’ is related to the Latin word for ‘fact’, and that it is not necessarily equivalent to information (DIKW Pyramid, anyone?). Part of the phenomenon that is Big D is the increasing ‘datafying’ of the world – the metrication of more and more aspects of life into data, trying to address the ‘unknown knowns’. Adding to this is the increasing ability of science to acquire not just a sample of data, but the entire ‘population’ of it. Paul also suggested that, increasingly, Big Data is more about distribution than analytical products – and that, in some respects, people’s own minds are the supercomputers. Returning to the seas, Paul left the GeoRabblers of the night with a delightful anecdote about the life of the sailor Matthew (Fontaine) Maury, “Pathfinder of the Seas” in the 1800s, under whose direction hundreds of ships’ logs were turned into data and locations charted (at least 1.2 million points!) to create Wind and Current Charts, which became an indispensable tool to mariners of all kinds. And apparently, coincidentally, the job title of those people going through the logs to capture the points was… ‘Computer’.


(most of) The GeoRabble Perth Crew

Many thanks to Darren Mottolini for his time and efforts in organising the event, especially as it coincided with the birth of his second child (congrats!), to the speakers of the night, to the talented MC Nicholas Flett, and to the event sponsor, Landgate.

And in the words of Ian McLeod, “keep logging, keep mapping and good luck to you”.

GeoRabble (http://www.georabble.org) happens in various locations around Australia, is free and open to anyone, but frequently sells out. If you would like to talk at a future Perth GeoRabble event, please send an email with the title and a short description to perth@georabble.org.

The next GeoRabble in Perth is on June 20, and free tickets are available here.


