Release 0.0.16: Example notebooks for the open model hub

Ivar Pruijn

With this release we've added several example notebooks that help you get started with the open model hub to analyze and model data. We've also released the Objectiv libraries to PyPI.

What is the open model hub?

It's a growing collection of open-source, free-to-use data models that you can take, combine, and run from your notebook for a wide range of typical product analytics use cases.

It enables you to quickly cluster your users, explore your conversion, report on retention, and much more.

A very common example is calculating average session duration. In the model hub, this is an aggregation model that is called with modelhub.aggregate.session_duration(). In the example below, we've used this model and filtered on new users only, using a mapping model: modelhub.map.is_new_user().

Avg. daily session duration for new users in a notebook, with the model hub

With .head() you can immediately query the data to preview the results, and with .to_pandas() you can pull all results into a pandas object in Python.
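To make the filtering step concrete, here is a minimal sketch in plain pandas rather than the model hub itself. It assumes a hypothetical events table with user_id, session_id, and moment columns, and it assumes that a "new user" is a user whose first recorded event falls on the same day as the event being labeled. The definition used by modelhub.map.is_new_user() may differ, so treat this as an illustration of the idea of a mapping model (one label per row, usable as a filter), not as its implementation.

```python
import pandas as pd

# Hypothetical event data; the column names are assumptions for illustration.
events = pd.DataFrame({
    "user_id": ["a", "a", "a", "b", "b"],
    "session_id": [1, 1, 2, 3, 3],
    "moment": pd.to_datetime([
        "2022-03-01 10:00:00", "2022-03-01 10:05:00",
        "2022-03-02 09:00:00",
        "2022-03-02 12:00:00", "2022-03-02 12:10:00",
    ]),
})


def is_new_user(df: pd.DataFrame) -> pd.Series:
    """Label each event True if it happened on the user's first day in the data.

    Assumed definition, for illustration only.
    """
    day = df["moment"].dt.normalize()
    first_day = df.groupby("user_id")["moment"].transform("min").dt.normalize()
    return day == first_day


# A mapping model adds one value per row, which can then be used as a filter.
events["is_new_user"] = is_new_user(events)
new_user_events = events[events["is_new_user"]]
print(new_user_events.head())
```

Unlike this sketch, which needs all events in memory, the model hub runs the equivalent logic in the database and only fetches the rows you ask for.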

Bach: the modeling library

The open model hub is powered by Bach: Objectiv's data modeling library. With Bach, you can compose complete models with familiar pandas-like dataframe operations. It uses an SQL abstraction layer that enables models to run on the full dataset, and it can export a model to SQL with a single command.

For instance, below is a snippet of the underlying SQL query that Bach generates for the session_duration() model, shown by calling display_sql_as_markdown(avg_duration):

[...several SELECTs and filtering]
"getitem_having_boolean___ddcf158ce4b9ea9d4318d5f57eb20bb5" as (
    select to_char("moment", 'YYYY-MM-DD') as "time_aggregation",
           "session_id" as "_session_id",
           min("moment") as "moment_min",
           max("moment") as "moment_max",
           ((max("moment")) - (min("moment"))) as "session_duration"
    from "getitem_where_boolean___42ef48db14f4e03bdac4ba717cf7ee3c"
    group by (to_char("moment", 'YYYY-MM-DD'), "session_id")
    having (((max("moment")) - (min("moment"))) > '0')
)
select "time_aggregation" as "time_aggregation", avg("session_duration") as "session_duration"
from "getitem_having_boolean___ddcf158ce4b9ea9d4318d5f57eb20bb5"
group by ("time_aggregation")
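
The SQL snippet above boils down to three steps: compute each session's duration per day as the last moment minus the first, drop sessions with zero duration, and average what remains per day. As a rough illustration, the same steps can be written in plain pandas. This is a sketch on hypothetical in-memory data, not Bach code and not how the model hub is implemented; the sample rows and the avg_session_duration helper are made up for the example, and durations are expressed in seconds for simplicity.

```python
import pandas as pd

# Hypothetical events; only the columns used by the SQL snippet are included.
events = pd.DataFrame({
    "session_id": [1, 1, 2, 3, 3, 4, 4],
    "moment": pd.to_datetime([
        "2022-03-01 10:00:00", "2022-03-01 10:05:00",  # session 1: 5 minutes
        "2022-03-01 11:00:00",                         # session 2: single event
        "2022-03-02 12:00:00", "2022-03-02 12:20:00",  # session 3: 20 minutes
        "2022-03-01 15:00:00", "2022-03-01 15:15:00",  # session 4: 15 minutes
    ]),
})


def avg_session_duration(df: pd.DataFrame) -> pd.Series:
    """Average session duration in seconds per day, mirroring the SQL steps."""
    # to_char("moment", 'YYYY-MM-DD') as "time_aggregation"
    day = df["moment"].dt.strftime("%Y-%m-%d").rename("time_aggregation")

    # group by (day, session_id), taking min and max of "moment"
    per_session = df.groupby([day, "session_id"])["moment"].agg(
        moment_min="min", moment_max="max"
    )
    duration = per_session["moment_max"] - per_session["moment_min"]
    per_session["session_duration"] = duration.dt.total_seconds()

    # having session_duration > 0
    per_session = per_session[per_session["session_duration"] > 0]

    # outer select: avg("session_duration") group by "time_aggregation"
    return per_session.groupby(level="time_aggregation")["session_duration"].mean()


result = avg_session_duration(events)
print(result)
```

Note that, as in the SQL, grouping happens on day and session together, so a session that spans midnight is counted as two shorter sessions.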

Example notebooks

We now have several Jupyter notebook examples in our GitHub repository to get you started with the open model hub and Bach.

How to use it

For a quick start, there’s a Docker-based demo that runs the full Objectiv pipeline on your local machine, including a Jupyter notebook environment with all the new example notebooks.

If you’re ready to use the open model hub and/or Bach in your notebooks, head on over to our Modeling documentation. The package can now be installed with pip:

pip install objectiv-modelhub

More models are coming

We’re continuously expanding the available set of models. One of the models that we’re working on right now is Impact Scores: scoring the impact of user interactions on your product goals, such as conversion.

To get updates on new models, star our repo.

Enjoy the new example notebooks!
