Project Level: Winter

Project Duration:

4 weeks – 20-36 hours per week. Applicant will be required on-site for the project.


The primary objective of this study is to explore whether unsupervised learning methods, such as manifold learning, can identify outliers in a sample of elliptical galaxies obtained from the Dark Energy Spectroscopic Instrument (DESI) survey. The sample was initially selected using simple brightness and color criteria, but about 50% of the measurements are likely to be flagged for removal because they may not belong to the sample. The traditional approach, manually inspecting images and spectra for specific features, is impractical given the volume of data expected from upcoming surveys. This four-week project will therefore analyze the data using machine learning methods and assess how accurately they group the data into distinct classes based on the images. The study will also measure the degree of overlap between these classes and the results of visual inspection, and explore whether visual inspections can serve as ground truth for training a machine learning algorithm.
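As a rough illustration of the kind of analysis involved, the sketch below embeds image-like feature vectors with a manifold learning method, flags low-density points as outliers, and counts how many of those flags agree with visual inspection. It is only a minimal sketch under stated assumptions: the data are synthetic random vectors standing in for flattened image cutouts, the visual inspection flags are invented for the example, and the choice of Isomap, LocalOutlierFactor and their parameters is illustrative rather than the method the project will necessarily use.

```python
import numpy as np
from sklearn.manifold import Isomap
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in for flattened image cutouts (assumption: 8x8 pixels each).
n_typical, n_unusual, n_pixels = 200, 20, 64
typical = rng.normal(0.0, 1.0, size=(n_typical, n_pixels))
unusual = rng.normal(0.0, 4.0, size=(n_unusual, n_pixels))  # much broader scatter
X = np.vstack([typical, unusual])

# Hypothetical visual-inspection flags: True means "an inspector would remove it".
visual_flag = np.concatenate(
    [np.zeros(n_typical, dtype=bool), np.ones(n_unusual, dtype=bool)]
)

# Standardize each pixel, then project the sample onto a 2-D manifold embedding.
X_scaled = StandardScaler().fit_transform(X)
embedding = Isomap(n_neighbors=10, n_components=2).fit_transform(X_scaled)

# Score points by local density in the embedded space; fit_predict returns -1 for outliers.
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.1)
ml_flag = lof.fit_predict(embedding) == -1

# Overlap between the machine learning flags and the visual flags.
both = int(np.sum(ml_flag & visual_flag))
print(f"flagged by ML: {ml_flag.sum()}, by eye: {visual_flag.sum()}, by both: {both}")
```

With real survey data, the feature vectors would come from the galaxy images themselves, and the contamination level and neighborhood sizes would need to be tuned rather than assumed.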

Expected Outcomes:

The student will gain experience in several aspects of astrophysics research. They will learn about data collection from large astronomical surveys such as the Dark Energy Spectroscopic Instrument (DESI), including the criteria for selecting samples based on brightness and color, and will come to understand the challenges posed by large data volumes and the limitations of traditional manual inspection. They will gain hands-on experience with unsupervised learning techniques, specifically manifold learning, assessing how accurately these techniques group data into distinct classes based on the images and how far those classes overlap with visual inspections. They will also investigate whether visual inspections can serve as ground truth for training machine learning algorithms. Together, these activities develop data analysis and machine learning skills that are highly sought after in today's job market.
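To give a sense of what using visual inspections as ground truth could look like in practice, the following minimal sketch trains a supervised classifier on labelled examples and evaluates it on held-out data. Everything here is an assumption made for illustration: the features are synthetic, the "keep" and "remove" labels are invented stand-ins for visual inspection results, and the random forest is just one of many classifiers that could be tried.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic features (assumption), e.g. embedding coordinates or image statistics.
n_keep, n_remove, n_features = 300, 300, 5
keep = rng.normal(0.0, 1.0, size=(n_keep, n_features))
remove = rng.normal(1.5, 1.0, size=(n_remove, n_features))
X = np.vstack([keep, remove])

# Hypothetical visual-inspection labels: 0 = keep in sample, 1 = remove.
y = np.concatenate([np.zeros(n_keep, dtype=int), np.ones(n_remove, dtype=int)])

# Hold out 30% of the labelled objects to estimate how well the labels generalize.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
accuracy = clf.score(X_test, y_test)

print(classification_report(y_test, y_pred, target_names=["keep", "remove"]))
```

How well such a classifier performs on real data depends on how consistent the visual labels are, which is one of the questions the project sets out to examine.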

Suitable for:

This project is ideal for individuals with a strong foundation in computer science, particularly in the Python programming language, and an interest in astrophysics research. Basic knowledge of machine learning, data analysis, and data visualization techniques would also be beneficial. Prior experience in astrophysics is not required, but a willingness to learn and a keen interest in the field would be advantageous. The project will use tools such as AstroML and scikit-learn, which are widely used in astrophysics research, so applicants with experience in these tools would be well suited. The project also offers the chance to work collaboratively with experts in the field and to build professional networks.

Further Information:

If you are interested in this project or have any questions, please contact Dr Khaled Said, who will be happy to provide additional information or clarification.

Project members

Dr Khaled Said Soliman

Research Fellow
School of Mathematics and Physics