
Machine Learning in Linux: astroML – statistical data analysis in astronomy and astrophysics

Last Updated on March 6, 2023

In Operation

A good way to start learning how to use the astroML module is to work through some of the many examples on the project’s website.

For example, let’s walk through the example that creates Hess diagrams of the SEGUE Stellar Parameters Pipeline (SSPP) data to show multiple features on a single plot.

Download the code using wget:

$ wget https://www.astroml.org/_downloads/33dfbd7e30005f392c3f866223a621d2/plot_SDSS_SSPP.py

Here’s the matplotlib output from the command:

$ python plot_SDSS_SSPP.py

Stellar Parameters Hess Diagram
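If you want a feel for what goes into a plot like this, here is a minimal sketch of the binning step behind a Hess diagram, which is essentially a 2D histogram of stars. It is not the code from plot_SDSS_SSPP.py: it uses only NumPy, the data are synthetic, and the column names (teff, logg, feh) and the hess_diagram helper are our own assumptions for illustration.

```python
import numpy as np

def hess_diagram(x, y, values, bins=(50, 50)):
    """Bin points on a 2D grid; return counts and the mean of `values` per bin."""
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=bins)
    # Reuse the same edges so the weighted sums line up with the counts
    sums, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges], weights=values)
    # Empty bins have no defined mean, so mark them as NaN
    with np.errstate(invalid="ignore", divide="ignore"):
        means = np.where(counts > 0, sums / counts, np.nan)
    return counts, means, x_edges, y_edges

# Synthetic stand-in for a stellar catalogue (NOT the SSPP data)
rng = np.random.default_rng(0)
teff = rng.normal(6000, 500, size=10_000)  # hypothetical effective temperature [K]
logg = rng.normal(4.0, 0.5, size=10_000)   # hypothetical surface gravity
feh = rng.normal(-1.0, 0.4, size=10_000)   # hypothetical metallicity

counts, mean_feh, x_edges, y_edges = hess_diagram(teff, logg, feh, bins=(40, 40))
print(counts.shape, int(counts.sum()))
```

The counts grid can be drawn as a background image and the per-bin mean overlaid as contours, which is one way to show more than one feature on a single plot.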

How about WMAP plotting with HEALPix? This example uses astroML.datasets.fetch_wmap_temperatures() to download and plot the raw WMAP 7-year data.

We need to install the healpy package (a Python interface to the HEALPix pixelization scheme, which also provides fast spherical harmonic transforms).

$ pip install healpy
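HEALPix divides the sphere into equal-area pixels, and the map resolution is set by a single parameter, nside. As a rough sketch of what that means, the snippet below computes the pixel count (12 × nside²) and an approximate pixel size using only the standard library. The function names and the nside values are ours, chosen for illustration; healpy ships its own helpers for this (nside2npix and nside2resol), which you would normally use instead.

```python
import math

def healpix_npix(nside):
    """Number of equal-area pixels in a HEALPix map with the given nside."""
    return 12 * nside * nside

def healpix_resolution_arcmin(nside):
    """Approximate pixel size: square root of the pixel area, in arcminutes."""
    pixel_area_sr = 4.0 * math.pi / healpix_npix(nside)
    return math.degrees(math.sqrt(pixel_area_sr)) * 60.0

# Illustrative nside values only
for nside in (64, 256, 512):
    print(nside, healpix_npix(nside), round(healpix_resolution_arcmin(nside), 2))
```

Doubling nside quadruples the number of pixels and halves the pixel size.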

Now we’ll use wget again to download the Python code.

$ wget https://www.astroml.org/_downloads/7608268ca4f0563da5ca8ca87b372ce0/plot_wmap_raw.py

Here’s the matplotlib output from the command:

$ python plot_wmap_raw.py

astroML example

Here’s a summary of the tools which astroML offers:

  • Download and work with astronomical data sets.
  • Histogram tools.
  • Density estimation.
  • Linear regression and fitting.
  • Time series analysis:
    • Periodic time series.
    • Aperiodic time series.
  • Statistical functions.
  • Dimensionality reduction.
  • Correlation functions – astroML implements a fast correlation function estimator based on the scikit-learn BallTree and KDTree data structures.
  • Filters.
  • Fourier and Wavelet transforms.
  • Luminosity functions.
  • Classification.
  • Resampling.
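
To make one of these items concrete, here is a minimal sketch of a two-point correlation function, which measures the excess of close pairs in a point set compared with a random one. This is not astroML's implementation: it is a brute-force NumPy version of the simple "natural" estimator DD/RR − 1, fine for a few hundred points, whereas the tree-based approach mentioned above exists precisely to avoid computing every pairwise distance. The helper names and the synthetic clumpy data are our own assumptions.

```python
import numpy as np

def pair_counts(points, bin_edges):
    """Histogram of pairwise separations, counting each unordered pair once."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)  # indices of pairs with i < j
    counts, _ = np.histogram(dist[upper], bins=bin_edges)
    return counts

def two_point_natural(data, randoms, bin_edges):
    """Natural estimator xi(r) = DD/RR - 1 using normalised pair counts."""
    dd = pair_counts(data, bin_edges) / (len(data) * (len(data) - 1) / 2)
    rr = pair_counts(randoms, bin_edges) / (len(randoms) * (len(randoms) - 1) / 2)
    # Bins with no random pairs have no defined estimate, so mark them as NaN
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(rr > 0, dd / rr - 1.0, np.nan)

# Synthetic clustered points (tight clumps) versus uniform randoms
rng = np.random.default_rng(42)
centres = rng.uniform(0, 1, size=(20, 2))
data = centres[rng.integers(0, 20, size=400)] + rng.normal(0, 0.01, size=(400, 2))
randoms = rng.uniform(0, 1, size=(800, 2))

bin_edges = np.linspace(0.0, 0.5, 11)
xi = two_point_natural(data, randoms, bin_edges)
print(xi)
```

A positive value in the smallest-separation bin indicates clustering on that scale; for unclustered data the values scatter around zero.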

Summary

astroML is a treasure trove of statistical and machine learning routines for analyzing astronomical data in Python, loaders for several open astronomical datasets, and a large range of examples of analyzing and visualizing astronomical datasets. It extends the functionality offered by general-purpose libraries such as NumPy and SciPy.

The project provides multiple examples for deep learning using astronomical data.

Using astroML in conjunction with NumPy, SciPy, Astropy, and scikit-learn requires some knowledge and experience. But these tools let you analyse huge quantities of astronomical data and generate some amazing output.

astroML uses data from the Sloan Digital Sky Survey (SDSS), a decade-plus photometric and spectroscopic survey at the Apache Point Observatory in New Mexico.

Website: www.astroml.org
Support: GitHub Code Repository
Developer: Jacob Vanderplas
License: BSD 2-Clause “Simplified” License

astroML is written in Python. Learn Python with our recommended free books and free tutorials.

For other useful open source apps that use machine learning/deep learning, we’ve compiled this roundup.


2 Comments
Steve
3 months ago

Python is a serial task execution language, so all Python-based solutions for ML and NNs are slow, and that’s a fact. It’s very hard, next to impossible, to apply hardware acceleration; you can use CUDA only if you code it yourself. Data mining is useless without hardware acceleration. KNIME, RapidMiner, MATLAB and some other solutions use it, others do not, and that’s something to consider if you want hardware performance, yet no one tells you about it. Everything Python-based is a non-performance toy suitable for playing around with small datasets, great to start with. As soon as one needs performance, Python is not the answer, at least until developers speed up the code and add parallelism and hardware CPU+GPU hybridization to the system. Using CUDA with Python is also very, very hard.

Jacob
3 months ago
Reply to  Steve

I stopped reading after your egregious assertion that python ML and NN are slow. What a load of baloney.