Last Updated on March 6, 2023
In essence, Machine Learning is the practice of using algorithms to parse data, learn from that data, and then make a determination or prediction. The machine is ‘trained’ using huge amounts of data.
In other words, Machine Learning is about building programs with tunable parameters (typically an array of floating point values) that are adjusted automatically so as to improve their behavior by adapting to previously seen data.
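To make the idea of tunable parameters concrete, here is a minimal sketch in plain Python. It is not taken from astroML; the function name fit_line, the learning rate, and the training data are all made up for illustration. The model is a straight line with two floating point parameters, adjusted by gradient descent so that it fits previously seen data.

```python
# A toy model y = w * x + b whose two tunable parameters are
# adjusted automatically to fit previously seen data.
# Everything here (names, data, step size) is an illustrative assumption.

def fit_line(xs, ys, learning_rate=0.01, steps=5000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    params = [0.0, 0.0]  # the tunable parameters: [w, b]
    n = len(xs)
    for _ in range(steps):
        w, b = params
        # Errors of the current model on the training data.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Nudge each parameter a small step against its gradient.
        params = [w - learning_rate * grad_w, b - learning_rate * grad_b]
    return params

# Made-up training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

With this data the fitted values should land close to w = 2 and b = 1. Real libraries such as scikit-learn do the same kind of adjustment with far more parameters and better optimizers.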
astroML is a Python module for machine learning and data mining built on NumPy, SciPy, scikit-learn, matplotlib, and Astropy.
The aim of the project is to offer a repository of Python implementations of common tools and routines used for statistical data analysis in astronomy and astrophysics, and to provide a uniform and easy-to-use interface to freely available astronomical datasets.
Installation
A fresh installation of Ubuntu 22.10 is missing git. Let’s install that first:
$ sudo apt install git
We will install astroML from its source code. Clone the project’s GitHub repository.
$ git clone https://github.com/astroML/astroML
Change into the newly created directory with the command:
$ cd astroML
We will install astroML system-wide:
$ sudo python3 setup.py install
We normally recommend installing software without polluting the system. Tools such as Anaconda and Docker are popular for this task. If you install Anaconda, you can then install astroML using conda, as there’s a conda package available.
$ conda install -c astropy astroML
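Whichever route you take, a quick way to confirm the result is to ask Python whether the module can be found. The snippet below is a small sketch using only the standard library; the helper name astroml_status is our own, and it assumes the package exposes a __version__ attribute (it falls back to "unknown version" if not).

```python
import importlib
import importlib.util

def astroml_status(module_name="astroML"):
    """Return a short description of whether the module can be imported."""
    # find_spec returns None when no module of that name is installed.
    if importlib.util.find_spec(module_name) is None:
        return f"{module_name} is not installed in this environment"
    module = importlib.import_module(module_name)
    # Not every package defines __version__, so fall back gracefully.
    version = getattr(module, "__version__", "unknown version")
    return f"{module_name} {version} is importable"

print(astroml_status())
```

Run it with the same interpreter you installed into, for example inside the activated conda environment.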
Your system needs:
- Python version 3.6+
- NumPy >= 1.13
- SciPy >= 0.19
- scikit-learn >= 0.18
- Matplotlib >= 3.0
- Astropy >= 3.0
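If you are unsure what is already installed, the list above can be checked with a short script. This is a rough sketch rather than an official tool: it relies on importlib.metadata, which is in the standard library from Python 3.8 onwards, and it uses a deliberately simple version comparison that only looks at the leading numeric parts. The helper names are our own.

```python
import sys
from importlib import metadata  # standard library from Python 3.8

# Minimum versions as listed above; keys are PyPI distribution names.
MINIMUMS = {
    "numpy": "1.13",
    "scipy": "0.19",
    "scikit-learn": "0.18",
    "matplotlib": "3.0",
    "astropy": "3.0",
}

def version_tuple(version):
    """Turn '1.13.3' or '3.0rc1' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to equal length."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b

def check_requirements(minimums=MINIMUMS):
    """Return a status string for every required distribution."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = f"{installed} ({'ok' if ok else 'needs >= ' + minimum})"
    return report

python_ok = sys.version_info >= (3, 6)
print("python", sys.version.split()[0], "ok" if python_ok else "needs >= 3.6")
for name, status in check_requirements().items():
    print(name, status)
```

For anything beyond a quick check, pip or conda will resolve these constraints more reliably than a hand-rolled comparison.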
You may also need some additional packages:
$ sudo apt-get install dvipng texlive-latex-extra texlive-fonts-recommended cm-super
For example, cm-super is needed for the type1ec.sty style file.
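To see whether these pieces are already present, a small standard-library check can help. This sketch assumes the packages are wanted for LaTeX text rendering in plots, and that TeX Live's kpsewhich lookup tool is on the PATH when LaTeX is installed; the helper name latex_tooling_status is ours.

```python
import shutil
import subprocess

def latex_tooling_status():
    """Report whether the LaTeX pieces mentioned above can be found."""
    status = {
        "latex": shutil.which("latex") is not None,
        "dvipng": shutil.which("dvipng") is not None,
        "type1ec.sty": False,
    }
    kpsewhich = shutil.which("kpsewhich")  # TeX Live's file lookup tool
    if kpsewhich:
        result = subprocess.run(
            [kpsewhich, "type1ec.sty"],
            capture_output=True, text=True, check=False,
        )
        # kpsewhich prints the file's path when the file is found.
        status["type1ec.sty"] = bool(result.stdout.strip())
    return status

for name, found in latex_tooling_status().items():
    print(f"{name}: {'found' if found else 'not found'}")
```

Anything reported as not found points to a package from the apt-get line above that still needs installing.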
Python is a serial task execution language, so all Python-based solutions for ML and NNs are slow, and that’s a fact. It is very hard, next to impossible, to apply hardware acceleration; you can use CUDA only if you code it yourself. Data mining is useless without hardware acceleration. KNIME, RapidMiner, MATLAB and some other solutions use it; others do not, and that’s something to consider if you want hardware performance, yet no one tells you about it. Everything Python-based is a non-performance toy suitable for playing around with small datasets, which is great to start with. As soon as one needs performance, Python is not the answer, at least until developers speed up the code and add parallelism and hybrid CPU+GPU hardware support to the system. Using CUDA with Python is also very, very hard.
I stopped reading after your egregious assertion that python ML and NN are slow. What a load of baloney.