With this release we've added several example notebooks that help you get started with the open model hub to analyze and model data. We've also released the Objectiv libraries to PyPI.
What is the open model hub?
It's a growing collection of open-source, free-to-use data models that you can take, combine, and run from your notebook for a wide range of typical product analytics use cases.
It enables you to quickly cluster your users, explore your conversion, report on retention, and much more.
An example of a very common model is calculating average session duration. In the model hub, this is an aggregation model that is called with modelhub.aggregate.session_duration(). In the example below, we've used this model and also filtered on new users only, using a mapping model: modelhub.map.is_new_user().
Avg. daily session duration for new users in a notebook, with the model hub
With .head() you can immediately query the data to show the results, and with .to_pandas() you can use all results as a pandas object in Python.
Bach: the modeling library
The open model hub is powered by Bach: Objectiv's data modeling library. With Bach, you can fully compose models with familiar Pandas-like dataframe operations. It uses an SQL abstraction layer that enables models to run on the full dataset, and can export to SQL with a single command.
See, for instance, the underlying SQL query snippet below, generated by Bach for the session_duration() model by calling display_sql_as_markdown(avg_duration):
[...several SELECTs and filtering]
"getitem_having_boolean___ddcf158ce4b9ea9d4318d5f57eb20bb5" as (select to_char("moment", 'YYYY-MM-DD') as "time_aggregation", "session_id" as "_session_id", min("moment") as "moment_min", max("moment") as "moment_max", ((max("moment")) - (min("moment"))) as "session_duration"
from "getitem_where_boolean___42ef48db14f4e03bdac4ba717cf7ee3c"
group by (to_char("moment", 'YYYY-MM-DD'), "session_id")
having (((max("moment")) - (min("moment"))) > '0')
)
select "time_aggregation" as "time_aggregation", avg("session_duration") as "session_duration"
from "getitem_having_boolean___ddcf158ce4b9ea9d4318d5f57eb20bb5"
group by ("time_aggregation")
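To make the logic of the query above easier to follow, here is a minimal sketch of the same three steps in plain Python: group events per day and session, keep only sessions with a duration greater than zero, and average per day. This is not Bach or model hub code. The sample events and the function name avg_session_duration are hypothetical and only serve as illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical sample events as (session_id, moment); not real Objectiv data.
events = [
    (1, datetime(2022, 3, 1, 10, 0, 0)),
    (1, datetime(2022, 3, 1, 10, 5, 0)),
    (2, datetime(2022, 3, 1, 11, 0, 0)),
    (2, datetime(2022, 3, 1, 11, 1, 0)),
    (3, datetime(2022, 3, 1, 12, 0, 0)),  # single-event session, duration is zero
    (4, datetime(2022, 3, 2, 9, 0, 0)),
    (4, datetime(2022, 3, 2, 9, 10, 0)),
]


def avg_session_duration(events):
    """Average session duration per day, mirroring the SQL snippet's steps."""
    # Step 1: group by (day, session_id) and track min and max moment.
    bounds = {}
    for session_id, moment in events:
        key = (moment.strftime("%Y-%m-%d"), session_id)
        first, last = bounds.get(key, (moment, moment))
        bounds[key] = (min(first, moment), max(last, moment))

    # Step 2: the HAVING clause, keep only sessions with duration > 0.
    durations_per_day = defaultdict(list)
    for (day, _session_id), (first, last) in bounds.items():
        duration = last - first
        if duration > timedelta(0):
            durations_per_day[day].append(duration)

    # Step 3: average the remaining session durations per day.
    return {
        day: sum(durations, timedelta(0)) / len(durations)
        for day, durations in sorted(durations_per_day.items())
    }


result = avg_session_duration(events)
for day, duration in result.items():
    print(day, duration)
```

The difference with Bach is where the work happens: this sketch pulls every event into memory, while the generated SQL runs the same aggregation inside the database on the full dataset.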
Example notebooks
We now have several Jupyter notebook examples in our GitHub repository to get you started on the open model hub & Bach. For instance:
- How to use the open model hub for basic product analytics.
- How to use Bach with rich open analytics taxonomy data.
- How to use Bach for feature engineering.
- How to use Bach + sklearn for Machine Learning directly on the raw data in your SQL database, such as user clustering.
How to use it
For a quick start, there’s a Docker-based demo that runs the full Objectiv pipeline on your local machine, including a Jupyter notebook environment with all the new example notebooks.
If you’re ready to use the open model hub and/or Bach in your notebooks, head on over to our Modeling documentation. The package can now be installed with pip:
pip install objectiv-modelhub
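After installing, one way to confirm that the package is available in your notebook environment is to look up its installed version. The sketch below uses only the Python standard library (Python 3.8 or later). The helper name installed_version is hypothetical, and the distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("objectiv-modelhub")
if found:
    print(f"objectiv-modelhub {found} is installed")
else:
    print("objectiv-modelhub is not installed in this environment")
```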
More models are coming
We’re continuously expanding the available set of models. One of the models that we’re working on right now is Impact Scores: scoring the impact of user interactions on your product goals, such as conversion.
To get updates on new models, star our repo.
Enjoy the new example notebooks!