[UNMAINTAINED] Automated machine learning: just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml
A fully featured default process for machine learning: all the parts are here and have functional default values in place. Modify to your heart's delight so you can focus on the important parts for your dataset, or run it all the way through with the default values for fully automated machine learning!
auto_ml: machineJS, but better!
I just built out v2 of this project that now gives you analytics info from your models, and is production-ready. machineJS is an amazing research project that clearly proved there's a hunger for automated machine learning.
auto_ml tackles this exact same goal, but with more features, cleaner code, and the ability to be copy/pasted into production.
Check it out! https://github.com/ClimbsRocks/auto_ml
machineJS provides a fully automated framework for applying machine learning to a dataset.
All you have to do is give it a .csv file, with some basic information about each column in the first row, and it will go off and do all the machine learning for you!
If you've already done this kind of thing before, it's useful as an outline, putting in place a working structure for you to make modifications within, rather than having to build from scratch again every time.
machineJS will tell you:
If you haven't done much (or any) machine learning before, it does some fairly advanced stuff for you!
If you want to install this in its own standalone repo and work on the source code directly, type the following from the command line:
git clone https://github.com/ClimbsRocks/machineJS.git
pip install -r requirements.txt
git clone https://github.com/scikit-learn/scikit-learn.git
python setup.py build
sudo python setup.py install
To train on one file and generate predictions for another, run:
node machineJS.js path/to/trainData.csv --predict path/to/testData.csv
We use the data-formatter module to automatically format your data, and even perform some basic feature engineering on it. Please refer to data-formatter's docs for information on how to label each column to be ready for machineJS.
machineJS is designed to be super easy to use without diving into any of the internals. Be a conjurer: just give it data and let it run! That said, it's super powerful once you start customizing it.
It's designed to be relatively easy to modify, and well-documented. The obvious place to start is inside processArgs.js. Here we set nearly all the parameters that are used throughout the project.
The other obvious area many people will be interested in is adding new models and different hyperparameter search spaces. This can be found in the pySetup folder. The exact steps are listed in
machineJS works on both regression and categorical problems, as long as there is a single output column in the training data. This includes multi-category (frequently called multi-class) problems, where the category you are predicting is one of many possible categories. There are no immediate plans to support multiple output columns in the training data. If you have three output columns you're interested in predicting, and they cannot be combined into a single column in the training data, you could run machineJS once for each of those three columns.
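If you do end up running once per output column, the splitting step might look roughly like the sketch below. It is only an illustration, not part of machineJS: the function name splitByOutputColumn and the column names are hypothetical, it assumes a simple CSV with no quoted fields containing commas, and it identifies columns by the names in the first row. It does not model the column labels that data-formatter expects, so those would still need to be set up as described in data-formatter's docs.

```javascript
// Sketch only: split one table with several output columns into one table
// per output column, so each can be handed to machineJS separately.
// Assumptions: hypothetical column names, and a simple CSV where no field
// contains a comma or a line break.

function splitByOutputColumn(csvText, outputColumns) {
  const rows = csvText
    .trim()
    .split(/\r?\n/)
    .map((line) => line.split(','));
  const header = rows[0];

  const result = {};
  for (const target of outputColumns) {
    if (!header.includes(target)) {
      throw new Error(`Column not found in header: ${target}`);
    }
    // Keep every input column plus this one target; drop the other outputs.
    const keep = header
      .map((name, index) => ({ name, index }))
      .filter(({ name }) => name === target || !outputColumns.includes(name))
      .map(({ index }) => index);

    result[target] = rows
      .map((row) => keep.map((i) => row[i]).join(','))
      .join('\n');
  }
  return result;
}

// Made-up example data with three output columns.
const example = [
  'sqft,bedrooms,price,daysOnMarket,soldAboveAsk',
  '1200,2,350000,14,yes',
  '2000,4,610000,30,no',
].join('\n');

const perTarget = splitByOutputColumn(example, [
  'price',
  'daysOnMarket',
  'soldAboveAsk',
]);

console.log(perTarget.price);
```

Each string in perTarget could then be written to its own file and passed to node machineJS.js as the training data, one run per output column.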
This library is well-tested on Macs. I've designed it to work on PCs as well, but I haven't tested that at all yet. If you're a PC user, I'd love some issues or Pull Requests to make this work for PCs!
Thanks for inviting us along on your machine learning journey!