Posted by Brian Keng on April 26, 2017

Data Science at Rubikloud

Over the last three years, Rubikloud has grown tremendously, going from a team of fewer than a dozen to a fast-growing venture-backed startup with more than 80 people. In this short time, we’ve assembled a team of talented engineers, retail experts and, of course, incredibly bright data scientists. With access to retail data spanning 10 countries and more than $100 billion in retail transactions, Rubikloud is leveraging data science to automate the thousands of decisions that retailers make on a daily basis.

At its core, Rubikloud builds machine learning SaaS products to serve some of the world’s largest retailers. This matters because both machine learning and software come first, not second, third, or fourth. As a result, we’ve evolved our organization to ensure that data science is at the heart of what we do.

A Data Science-Centric Organization

As a machine learning SaaS company, Rubikloud has been explicitly structured around the main tasks of a data science product. Everything from client interfacing to building models, and from automating deployment to visualization, is handled by a team that specializes in that function.

Our thinking is that a single data scientist is almost never an expert in all of these functions; that person would be the so-called data science “unicorn”. Instead, data scientists are usually strong in one or two areas, such as modelling or exploratory analysis, and generalists in the others, which lets them support the various aspects of a data science project. We think the key to unleashing the potential of data science is building an organization that supports every one of those aspects.

The Rubikloud Data Scientist

At Rubikloud, we’re not hunting for unicorns: we’re looking for “T”-shaped individuals who have deep experience in building robust and scalable predictive models, along with enough breadth to contribute to other parts of a data science project. These “T”-shaped individuals are at the core of the company, where we use big data and machine learning to provide practical solutions to retail-specific problems and pain points.

Data scientists at Rubikloud are “T”-shaped (or, more accurately, “t-distributed”) individuals

With teams built around each of the functions in a data science project, Rubikloud data scientists can leave the details of tasks such as ETL or front-end visualization to those teams and focus on what they’re best at: building scalable and robust predictive models.

A good example of this support is our partners in crime, the Data Science Engineering team. They are a team of dedicated software engineers whose primary purpose is to build the infrastructure and plumbing for our machine learning system. They aren’t focused on implementing production models (that’s a data scientist’s job); rather, they provide the framework, APIs, and infrastructure that allow data scientists to create and deploy new models into production. Their work lets data scientists take their models from prototype to production with as little friction as possible.

Building the Future of Retail Data Science

There is no shortage of interesting problems in retail data science. Here are just a few of the areas we’re actively working on today:

  • Recommendation Systems: Finding the best combination of product, offer, and channel to send to each of a retailer’s customers.
  • Promotional Forecasting & Optimization: Predicting the sales of products on promotion and optimizing their mechanics and timing.
  • Customer Targeting: Finding the right set of customers to target under a set of given constraints such as marketing budget, past purchase history, and campaign theme.
  • Customer Lifecycle Modelling: Estimating the stage of a customer’s lifecycle (e.g. acquisition, retention, growth, churn) and what action to take in each stage.
  • Customer Replenishment: Building statistical models to determine when a customer buys consumable products on a recurring basis and the best time to send them an offer.
  • Product Assortment: Determining the right combination of products to sell across brands and categories given constraints such as product breadth, vendor relationships, and trending products.

If any of this sounds interesting to you, we’re always looking for intellectually curious people with a passion for building practical data science solutions that have a huge impact on real-world business outcomes. We have a diverse team of data scientists, and we’d like to add more: