Today we have a guest post from Dr Eriita Jones and Professor Mark McDonnell. Eriita is a Planetary and Space Scientist, Research Fellow at the School of IT and Mathematical Sciences, University of South Australia, and an ECR member of the National Committee for Space and Radio Science. Her primary research areas are (i) the remote detection and characterisation of subsurface water environments on Mars and Earth, and (ii) quantifying the habitability of other planetary bodies. She is particularly interested in new computational data analysis techniques and in assessing the benefits of machine learning for space science. Mark McDonnell leads the Computational Learning Systems Laboratory at the University of South Australia. He has published over 100 research articles in the fields of machine learning, computational neuroscience, and statistical physics. Mark has worked extensively with industry partners to deliver applied machine learning solutions in areas such as precision agriculture, recycling, and sports analytics. His research interests lie at the intersection of machine learning and neurobiological learning.
Artificial intelligence may get some bad press, but there are of course many tasks where AI can be of tremendous benefit to human beings. One of these tasks is called ‘image segmentation’, which is the process of automatically dividing an image into objects or categories so that every pixel in the image receives an associated label (e.g. car, dog, tree). This is essentially what the Planet Four citizen scientists are doing when they manually outline the boundaries of fans and blotches in polar springtime imagery from Mars. Just like a human being, a machine needs to be taught (or ‘trained’) in the task it is being asked to perform before it can learn a new skill. For state-of-the-art automated image segmentation, this training requires large amounts of data in the form of images with the categories of interest clearly labelled. In 2018, researchers at the Computational Learning Systems Laboratory at the University of South Australia in Adelaide, Australia, realised that a large amount of labelled imagery was exactly what the citizen scientists on the Planet Four project were generating. That was the start of a collaboration with the Planet Four Science Team. We wondered: could we teach an algorithm to automatically detect fans and blotches in Martian imagery? How well could a machine learn these complex features? And could the algorithm provide information which would assist the scientists in their study of these Martian phenomena?
The machine learning algorithms used here are examples of deep Convolutional Neural Networks (CNNs), which generally perform very strongly on image segmentation problems. The algorithms are fed thousands of labelled fan and blotch images produced by the Planet Four citizen scientists. After lots of exposure to what fans and blotches look like at different locations, years, solar longitudes, and resolutions, the algorithms become able to generalise from their experiences and apply their learning to new situations – in this case, unlabelled images that they have never seen before. In order to assess how well the machine learning techniques are performing, the algorithms are given a test. They are asked to predict where the boundaries of the fans and blotches are in some labelled images, but the algorithms are not shown the labels and have never seen those images before. We can then compare the machine’s predictions with the ‘correct answers’ – the manual labels drawn by citizen scientists. We also compare with another method: a more traditional and less complex image classifier that does not employ machine learning. The figures below show the output on a subset of one HiRISE image.
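To give a feel for what comparing the machine’s predictions with the citizen scientists’ labels can look like in practice, here is a minimal sketch in Python. It uses intersection-over-union (IoU), a common score for segmentation, on two tiny made-up masks. This is an illustration only: the post does not say which metric the team uses, and the function name and toy masks are our own assumptions.

```python
def intersection_over_union(predicted, labelled):
    """Score the overlap between two binary masks (lists of rows of 0/1).

    Returns 1.0 for a perfect match and 0.0 for no overlap at all.
    NOTE: IoU is an assumed choice of metric, used here for illustration.
    """
    if len(predicted) != len(labelled) or any(
        len(p_row) != len(l_row) for p_row, l_row in zip(predicted, labelled)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels marked as fan/blotch in both masks
    union = 0         # pixels marked as fan/blotch in at least one mask
    for p_row, l_row in zip(predicted, labelled):
        for p, l in zip(p_row, l_row):
            intersection += bool(p) and bool(l)
            union += bool(p) or bool(l)

    # If neither mask contains a feature, treat that as full agreement.
    return intersection / union if union else 1.0


# Toy 4x4 example (made-up data): 1 = fan/blotch pixel, 0 = background.
labelled = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
predicted = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

score = intersection_over_union(predicted, labelled)
print(f"IoU = {score:.2f}")
```

In this toy case the prediction covers the whole labelled region but spills over by two pixels, so four of the six marked pixels agree.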
We are busily working on validating the output of the machine learning algorithms on a large number of images, but we can already see ways in which they can be very useful. Although the algorithms might not always find every fan or blotch in an image, they are very good at deciding whether there is at least one feature present. In other words, they do a good job of sorting the images which have a fan or blotch from those that have no fans or blotches at all. This is a very useful way of streamlining the presentation of images on the Planet Four Zooniverse platform. For example, instead of asking volunteers to click through ‘featureless’ images, the Planet Four Team may in future wish to make sure that every image that appears has a fan or blotch in it for labelling. Additionally, by automatically predicting the presence of fans and blotches in new images, the algorithms provide early information on feature number and density that can allow the Planet Four Team to be more selective about which images have the highest priority for manual labelling.
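As a rough sketch of the sorting idea described above, the Python below takes predicted masks for a handful of image tiles, drops the ones with no predicted features, and orders the rest by how much of the tile is covered. Everything here is hypothetical: the tile names, the `prioritise_tiles` helper, and the choice to rank denser tiles first are our own assumptions, not the Planet Four pipeline.

```python
def feature_fraction(mask):
    """Fraction of pixels in a binary mask predicted to be fan or blotch."""
    total = sum(len(row) for row in mask)
    flagged = sum(1 for row in mask for pixel in row if pixel)
    return flagged / total if total else 0.0


def prioritise_tiles(predicted_masks, min_fraction=0.0):
    """Drop 'featureless' tiles and rank the rest by predicted coverage.

    predicted_masks: dict mapping a tile name to its predicted binary mask.
    min_fraction:    tiles at or below this coverage are treated as empty
                     (an assumed threshold, for illustration only).
    """
    scored = [(name, feature_fraction(mask)) for name, mask in predicted_masks.items()]
    kept = [(name, fraction) for name, fraction in scored if fraction > min_fraction]
    # Assumption: tiles with more predicted coverage are shown first.
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Made-up predictions for three small tiles.
predicted_masks = {
    "tile_a": [[0, 0], [0, 0]],  # nothing predicted: filtered out
    "tile_b": [[1, 0], [0, 0]],  # sparse
    "tile_c": [[1, 1], [1, 0]],  # dense
}

queue = prioritise_tiles(predicted_masks)
for name, fraction in queue:
    print(f"{name}: {fraction:.0%} of pixels predicted as fan/blotch")
```

Pixel coverage is only a stand-in for density here; counting individual fans and blotches would need an extra step to separate touching features.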
Could machine learning one day put citizen scientists out of a job? We don’t think this is very likely. The algorithms may eventually learn to perform very well on new images if those images are similar enough to ones they have seen before. But if they are shown an image that is very different (e.g. with unusual lighting conditions, strange background terrain, or uncharacteristic fans and blotches), it is likely that the machine won’t be quite as good at segmentation as a well-trained human eye. So don’t worry, citizen scientists: AI is just here to lend a hand. Thanks for all the fabulous data, and stay tuned for an exciting update in a few months!