OCTOBER 11-15, 2021

Carlos Giovanni Martinez Gutierrez: Discovery of structures in galaxy images with MinHashing, k-means and SIFT

Sep 22, 2021 (updated Sep 28, 2021)


Summary

Thanks to modern telescope technology, it has been possible to gather a large number of images of the universe. These images contain information about the numerous bodies that compose it, such as stars, nebulae, and galaxies. Because of this volume, analyzing them is a strenuous task for experts, and cooperation with other fields is needed, such as artificial intelligence and, in particular, computer vision, which develops algorithms that carry out the analysis and comparison process quickly and efficiently.

This work applies a computer vision method to find patterns within a large collection of galaxy images obtained from the Galaxy Zoo project. Working with this dataset allows us to discover structures, that is to say, the elementary parts that form a galaxy.

The project has four main steps. In the first, points of interest are detected and extracted from each image using algorithms such as SIFT and SURF, as well as DELF, a model based on neural networks. In the second, the points of interest are grouped using clustering algorithms. In the third, potential structures are discovered using the points of interest and the clusters obtained in the previous steps. Finally, the structures discovered in the images are visualized and analyzed in order to adjust the parameters and improve the results.
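The second and third steps can be illustrated with a minimal sketch. The abstract does not say how they are implemented; the title names k-means and MinHashing, so the sketch assumes k-means for the clustering and a simple min-hash grouping of clusters that occur in the same images. The 2-D toy vectors stand in for real descriptors (SIFT descriptors would be 128-dimensional), and every function name and parameter below is hypothetical.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize from k of the input points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def words_to_images(labels, image_ids):
    """Map each cluster (a "visual word") to the set of images it occurs in."""
    occurrences = defaultdict(set)
    for word, image in zip(labels, image_ids):
        occurrences[word].add(image)
    return dict(occurrences)


def minhash_groups(word_to_images, n_images, tuple_size=2, n_tables=20, seed=0):
    """Group visual words whose image sets collide under min-hashing.

    Assumed scheme: each table draws `tuple_size` random permutations of the
    image ids; a word's key is the tuple of minimum permuted ids over its
    images. Words sharing a key in any table form a candidate structure.
    """
    rng = random.Random(seed)
    groups = set()
    for _ in range(n_tables):
        perms = [rng.sample(range(n_images), n_images) for _ in range(tuple_size)]
        buckets = defaultdict(list)
        for word, images in word_to_images.items():
            key = tuple(min(perm[i] for i in images) for perm in perms)
            buckets[key].append(word)
        groups.update(frozenset(b) for b in buckets.values() if len(b) > 1)
    return groups


# Toy data: 2-D stand-ins for descriptors, plus the image each one came from.
descriptors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
image_ids = [0, 1, 2, 0, 1, 2]

labels = kmeans(descriptors, k=2)                      # step 2: cluster
occurrences = words_to_images(labels, image_ids)       # which images hold each cluster
candidates = minhash_groups(occurrences, n_images=3)   # step 3: candidate structures
```

Two clusters with similar image sets are likely to share a min-hash key, and clusters with identical image sets always do, which is why co-occurring clusters surface as candidate structures. Larger `tuple_size` values make collisions stricter, while more tables give similar sets more chances to collide; tuning such parameters corresponds to the adjustment described in the final step.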

With this research we show that deep learning, computer vision, and unsupervised learning make it possible to find interesting characteristics in astronomical images. Furthermore, the pipeline is applicable to other types of images.