Brings SQL and AI together.
SQLFlow is a compiler that compiles a SQL program into a workflow that runs on Kubernetes. The input is a SQL program written in our extended SQL grammar, which supports AI jobs including training, prediction, model evaluation, model explanation, custom jobs, and mathematical programming. The output is an Argo workflow that runs distributed on a Kubernetes cluster.
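To make the "SQL program in, ordered workflow steps out" idea concrete, here is a toy Python sketch. It is not SQLFlow's parser or compiler: `plan_steps` and the keyword pattern are hypothetical, the pattern covers only the `TO TRAIN` and `TO PREDICT` clauses, and the final plain `SELECT` is added purely for illustration.

```python
import re

# Hypothetical keyword pattern: covers only TO TRAIN and TO PREDICT.
EXTENDED = re.compile(r"\bTO\s+(TRAIN|PREDICT)\b", re.IGNORECASE)


def plan_steps(program):
    """Split a SQL program on ';' and label each statement.

    Naive on purpose: a ';' inside a string literal would break it.
    Returns a list of (kind, statement) tuples in program order.
    """
    steps = []
    for raw in program.split(";"):
        stmt = " ".join(raw.split())  # collapse runs of whitespace
        if not stmt:
            continue  # skip empty fragments, e.g. after the last ';'
        match = EXTENDED.search(stmt)
        kind = match.group(1).lower() if match else "sql"
        steps.append((kind, stmt))
    return steps


program = """
SELECT * FROM iris.train TO TRAIN DNNClassifier
WITH model.n_classes = 3, model.hidden_units = [10, 20]
COLUMN sepal_length, sepal_width, petal_length, petal_width
LABEL class INTO sqlflow_models.my_dnn_model;
SELECT * FROM iris.test TO PREDICT iris.predict.class
USING sqlflow_models.my_dnn_model;
SELECT * FROM iris.predict;
"""

for index, (kind, stmt) in enumerate(plan_steps(program), start=1):
    print(index, kind, stmt[:40])
```

Each labeled statement stands in for one step of a workflow. The real compiler emits an Argo workflow rather than a Python list.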
Try SQLFlow NOW in our playground https://playground.sqlflow.tech/ and check out the handy tutorials in it.
The current experience of developing ML-based applications requires a team of data engineers, data scientists, and business analysts, as well as a proliferation of advanced languages and programming tools such as Python, SQL, SAS, SASS, Julia, and R. This fragmentation of tooling and development environments adds engineering difficulty to model training and tuning. What if we marry the most widely used data management and processing language, SQL, with ML and system capabilities, and let engineers with SQL skills develop advanced ML-based applications?
None of the existing solutions solves our pain point; instead, we want a solution that is fully extensible.
Here are examples of training a TensorFlow DNNClassifier model on the sample data iris.train and running prediction with the trained model. You can see how cool it is to write elegant ML code in SQL:
sqlflow> SELECT *
FROM iris.train
TO TRAIN DNNClassifier
WITH model.n_classes = 3, model.hidden_units = [10, 20]
COLUMN sepal_length, sepal_width, petal_length, petal_width
LABEL class
INTO sqlflow_models.my_dnn_model;

...
Training set accuracy: 0.96721
Done training

sqlflow> SELECT *
FROM iris.test
TO PREDICT iris.predict.class
USING sqlflow_models.my_dnn_model;

...
Done predicting. Predict table : iris.predict
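The training statement above follows a fixed clause order: a standard SELECT, then TO TRAIN, WITH, COLUMN, LABEL, and INTO. As a minimal sketch, assuming you want to generate such statements from a script, the Python helper below assembles one from its parts. `build_train_statement` is a hypothetical name, not part of SQLFlow, and it performs no escaping or validation, so it should not be fed untrusted input.

```python
def build_train_statement(select, estimator, attributes, columns, label, model):
    """Assemble a TO TRAIN statement in the clause order used in this README.

    Hypothetical helper for illustration only; values are interpolated
    as-is, with no quoting or validation.
    """
    with_clause = ", ".join(f"{key} = {value}" for key, value in attributes.items())
    column_clause = ", ".join(columns)
    return (
        f"{select} TO TRAIN {estimator} "
        f"WITH {with_clause} "
        f"COLUMN {column_clause} "
        f"LABEL {label} "
        f"INTO {model};"
    )


statement = build_train_statement(
    select="SELECT * FROM iris.train",
    estimator="DNNClassifier",
    attributes={"model.n_classes": 3, "model.hidden_units": [10, 20]},
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width"],
    label="class",
    model="sqlflow_models.my_dnn_model",
)
print(statement)
```

The printed statement is the same text as the training example, on a single line, and can be pasted at the `sqlflow>` prompt.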
SQLFlow would love to support as many mainstream ML frameworks and data sources as possible, but this expansion would be hard to achieve on our own, so we would love to hear your opinions on which ML frameworks and data sources you currently use and build upon. Please refer to our roadmap for specific timelines, and let us know your current scenarios and interests around the SQLFlow project so that we can prioritize based on feedback from the community.
Your feedback is our motivation to move forward. Please let us know your questions, concerns, and issues by filing GitHub Issues.