Polars is a DataFrame interface on top of an OLAP Query Engine implemented in Rust using the Apache Arrow Columnar Format.
Category: Scientific
datatable – manipulate 2-dimensional tabular data structures
datatable is a Python package for manipulating 2-dimensional tabular data structures (aka data frames).
Modin – drop-in replacement for pandas
Modin is a drop-in replacement for pandas. While pandas is single-threaded, Modin lets you instantly speed up your workflows.
pandas – Python Data Analysis Library
pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools.
NumPy – core package for scientific computing with Python
NumPy is the fundamental package for scientific computing with Python.
SciPy – Scientific Computing Tools for Python
SciPy (pronounced “Sigh Pie”) is a Python-based ecosystem of open-source software for mathematics, science, and engineering.
Vaex – Multi-code Analysis Toolkit for Visualization and Exploration of Big Tabular Data
Vaex is a program and Python library to visualize and explore large tabular datasets. It can calculate statistics.
HoloViews – Make Data Analysis and Visualization Seamless
HoloViews is an open-source Python library designed to make data analysis and visualization seamless and simple.
Dask – Advanced Parallelism for Analytics
Dask is a flexible parallel computing library for analytic computing. It takes a Python job and shares it across multiple systems.
Optimus – profile, clean, process and perform machine learning
Optimus is the missing framework to profile, clean, process and do ML in a distributed fashion using Apache Spark (PySpark).
yt – Multi-code Toolkit for Analyzing and Visualizing Volumetric Data
yt is an open-source Python package for analyzing and visualizing volumetric data. yt focuses on driving physically-meaningful inquiry.
AWS Data Wrangler – extends the power of the pandas library
AWS Data Wrangler extends the power of the pandas library to AWS, connecting DataFrames and AWS data-related services.
R – software environment for statistical computing and graphics
The R Project for Statistical Computing (R) is a free software environment for statistical computing and graphics.
MOA – software environment for data stream mining
MOA is a software environment for implementing algorithms and running experiments for online learning from evolving data streams.
Orange – data mining software
Orange is a component-based framework for machine learning and data mining. It includes a range of data visualization and exploration techniques.