last commit


The goal of this work is to develop a tool, named learningOrchestra, to
facilitate and streamline the data science iterative process of:

  • Gathering data;
  • Cleaning/preparing the datasets;
  • Building models;
  • Validating their predictions; and
  • Deploying the results.


The architecture of learningOrchestra is a collection of microservices deployed
in a cluster.

A dataset (in CSV format) can be loaded from an URL using the
Database API
microservice, which converts the dataset to JSON and later stores it in MongoDB.

It is also possible to perform several preprocessing and analytical tasks using
learningOrchestra’s collection of microservices.

With learningOrchestra, you can build prediction models with different
classificators simultaneously using stored and preprocessed datasets with the
Model Builder microservice. This microservice uses a Spark
cluster to make prediction models using distributed processing. You can compare the different
classification results over time to fit and increase prediction accuracy.

By providing their own preprocessing code, users can create highly customized model predictions
against a specific dataset, increasing model prediction accuracy.
With that in mind, the possibilities are endless! 🚀

Getting Started

To make using learningOrchestra more accessible, we provide the
learning_orchestra_client Python package. This package provides developers with all of
learningOrchestra’s functionalities in a Python API.

To improve your user experience, you can export and analyse the results using a MongoDB GUI, such as

We also built a
demo of learningOrchestra (in learning_orchestra_client usage example section)
with the Titanic challenge dataset.

The learningOrchestra documentation has more details on how to install and use it. We also provided documentation and examples for each microservice and Python package.

Read More

ترك الرد

من فضلك ادخل تعليقك
من فضلك ادخل اسمك هنا