Last Updated on July 25, 2023
scikit-learn is an open source Python module for machine learning built on top of SciPy. It offers efficient implementations of a large number of common algorithms. The software exposes a clean, uniform, and streamlined API, with good online documentation.
The software provides various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
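To illustrate the uniform API, here is a minimal sketch that trains two of the algorithms named above on the bundled iris dataset. The choice of models, the 25% test split, and the random seed are assumptions made for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load the bundled iris dataset as a feature matrix and a label vector.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for evaluation (illustrative split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two different algorithms, driven through the same fit/score interface.
models = {
    "SVC": SVC(),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                 # train on the training split
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    print(f"{name}: {scores[name]:.3f}")
```

Because every estimator follows the same `fit` and `score` convention, swapping one algorithm for another usually means changing a single line.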
scikit-learn is largely written in Python, with some core algorithms written in Cython to optimise performance.
scikit-learn is one of the most useful modules for machine learning in Python.
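The software has the following dependencies (the minimum versions listed are those of older releases; current releases require newer versions of Python, NumPy, and SciPy):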
- Python (>= 2.7 or >= 3.4)
- NumPy (>= 1.8.2)
- SciPy (>= 0.13.3)
scikit-learn also uses CBLAS, the C interface to the Basic Linear Algebra Subprograms library. matplotlib is needed to run the supplied examples.
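To see which versions of these dependencies are present on a given system, a short check along the following lines may help. It is only a sketch and assumes all three packages are already installed.

```python
import numpy
import scipy
import sklearn

# Collect the installed version string of each package.
versions = {
    "scikit-learn": sklearn.__version__,
    "NumPy": numpy.__version__,
    "SciPy": scipy.__version__,
}

for name, version in versions.items():
    print(f"{name}: {version}")
```

scikit-learn also provides `sklearn.show_versions()`, which prints a fuller report including system information.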
Features include:
- Simple and efficient tools for data mining and data analysis.
- Supervised learning algorithms – great coverage for this type of algorithm. There are Generalised Linear Models, Linear and Quadratic Discriminant Analysis, Support Vector Machines, Decision Trees, Bayesian methods, Gaussian Processes, Neural network models, and more.
- Cross-validation – various methods to check the accuracy of supervised models on unseen data.
- Unsupervised learning algorithms – again, a wide range of algorithms is available, including Gaussian mixture models, manifold learning, clustering, factor analysis, covariance estimation, density estimation, and more.
- Dataset transformations – provides a library of transformers, which clean, reduce, expand, or generate feature representations.
- Feature extraction – useful for extracting features from images and text.
- Various sample datasets – the datasets are useful in learning how to use scikit-learn. They include: Boston house prices dataset (removed in scikit-learn 1.2), iris dataset, diabetes dataset, digits dataset, linnerud dataset, wine dataset, and a breast cancer dataset.
- Built on NumPy, SciPy, and matplotlib.
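Several of the features listed above can be combined in a few lines. The following sketch chains a dataset transformer with a supervised model and estimates accuracy on unseen data with 5-fold cross-validation, using the bundled wine dataset. The particular scaler, classifier, `max_iter` value, and fold count are illustrative assumptions, not recommendations.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load one of the bundled sample datasets.
X, y = load_wine(return_X_y=True)

# Chain a transformer (feature scaling) with a supervised estimator.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validation: the pipeline is refitted on each training fold
# and scored on the corresponding held-out fold.
cv_scores = cross_val_score(pipeline, X, y, cv=5)

print("Accuracy per fold:", cv_scores)
print("Mean accuracy:", cv_scores.mean())
```

Putting the scaler inside the pipeline means it is fitted on the training folds only, so no information from the held-out fold leaks into preprocessing.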
Website: scikit-learn.org
Support: Documentation, GitHub, Gitter, Mailing List
Developer: scikit-learn developers
License: New BSD License