In this release we introduce new models to easily predict and analyze user behavior directly on data collected using the open analytics taxonomy, and to seamlessly switch between the full data set and a sample. No manual cleaning, transformations, sample exports, or complex tooling are required.
The first model added to the open model hub is Logistic Regression, to predict user behavior such as:
- Will a user convert?
- Will a user start using a specific product feature or area?
- Will a user have a long active session duration?
In addition, two new models analyze which features are used the most at any stage of your analysis, and what users did before converting.
Data collected with Objectiv's tracker is very well-structured, which makes it ideal for various machine learning applications.
As a simple example, we will predict whether users on our own website will reach the modeling section of our docs, by looking at the interactions they have with all the other main sections of the website. We’ll use a simple dataframe that counts the number of clicks per user in each section of our website, using the root location.
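To make the shape of that feature set concrete, here is a minimal plain-Python sketch of counting clicks per user per root location. The event rows, user ids, and the `count_clicks_per_section` helper are made up for illustration; they are not part of the open model hub, which builds this dataframe for you.

```python
from collections import Counter, defaultdict

# Hypothetical click events as (user_id, root_location) pairs.
clicks = [
    ("u1", "home"), ("u1", "blog"), ("u1", "blog"),
    ("u2", "home"), ("u2", "modeling"),
    ("u3", "tracking"), ("u3", "tracking"), ("u3", "home"),
]

def count_clicks_per_section(events):
    """Return {user_id: {root_location: click_count}} with zeros filled in."""
    sections = sorted({section for _, section in events})
    counts = defaultdict(Counter)
    for user_id, section in events:
        counts[user_id][section] += 1
    # Give every user the same feature columns; a Counter yields 0 for
    # sections the user never clicked in.
    return {
        user: {section: per_user[section] for section in sections}
        for user, per_user in counts.items()
    }

features = count_clicks_per_section(clicks)
for user, row in sorted(features.items()):
    print(user, row)
```

Each row is one user and each column one section, which is the layout a classifier expects: the column for the modeling section can serve as the target, and the other columns as features.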
See the example notebook for the intermediate steps of sampling the data, initializing the model, and fitting it. Note that for fitting the model, data is extracted from the database under the hood.
We can then create columns for the predicted values and labels in the sampled data set, and show the predictions (True if probability is >0.5):
features_set_sample['predicted_values'] = lr.predict_proba(X)
features_set_sample['predicted_labels'] = lr.predict(X)
# show the sampled data set, including predictions
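For readers who want to see how the predicted values relate to the predicted labels, the sketch below implements a tiny logistic regression in plain Python with batch gradient descent. It is an illustration only, not the model hub's implementation: the data, the function names, and the training settings are all assumptions chosen for this example.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, learning_rate=0.1, epochs=5000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, target in zip(X, y):
            z = sum(w * x for w, x in zip(weights, row)) + bias
            error = sigmoid(z) - target
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

def predict_proba(X, weights, bias):
    """Probability of the positive class for each row."""
    return [sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) for row in X]

def predict(X, weights, bias, threshold=0.5):
    """Label is True when the predicted probability exceeds the threshold."""
    return [p > threshold for p in predict_proba(X, weights, bias)]

# Made-up data: clicks in two sections per user; 1 = reached the target section.
X = [[0, 1], [1, 0], [2, 1], [5, 2], [6, 3], [7, 1]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic(X, y)
probabilities = predict_proba(X, weights, bias)
labels = predict(X, weights, bias)
print(list(zip(probabilities, labels)))
```

The labels are simply the probabilities cut at 0.5, which is the rule described above for the predicted labels column.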
Now that we have the model results, the data can easily be unsampled to work with the full data set, and its SQL exported to run in production:
features_set_full = features_set_sample.get_unsampled()
That’s all there is to running a logistic regression model on the full data set collected with Objectiv’s tracker, using the open model hub.
Top Used Product Features
The second model added to the open model hub is top_product_features. It enables you to understand which features are used the most in your full product, a subset of your product (using the location stack), or a selection of users (e.g. new users).
As an example:
top_product_features = modelhub.aggregate.top_product_features(df)
... outputs the most used features overall:
You can narrow it down to a selection of users, e.g. new users:
df['is_new_user'] = modelhub.map.is_new_user(df)
top_product_features_new_users = modelhub.aggregate.top_product_features(df[df['is_new_user']])
# Slice on the blog
top_product_features_blog_section = modelhub.aggregate.top_product_features(df[df.root_location == 'blog'])
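As a rough mental model of what such a ranking involves, the plain-Python sketch below counts distinct users per feature and then repeats the ranking on a slice. It assumes usage is measured as the number of distinct users per feature; the events and the `top_features` helper are hypothetical and only illustrate the idea, they are not the model hub's code.

```python
from collections import defaultdict

# Hypothetical events as (user_id, root_location, feature_name).
events = [
    ("u1", "home", "cta-button"),
    ("u1", "blog", "post-link"),
    ("u2", "home", "cta-button"),
    ("u2", "home", "cta-button"),  # repeat use by the same user counts once
    ("u3", "blog", "post-link"),
    ("u3", "blog", "share-button"),
    ("u3", "home", "cta-button"),
]

def top_features(rows):
    """Rank (root_location, feature) pairs by number of distinct users."""
    users_per_feature = defaultdict(set)
    for user_id, root_location, feature in rows:
        users_per_feature[(root_location, feature)].add(user_id)
    ranked = [(key, len(users)) for key, users in users_per_feature.items()]
    # Highest user count first; ties are broken alphabetically.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))

overall = top_features(events)
blog_only = top_features([row for row in events if row[1] == "blog"])
print(overall)
print(blog_only)
```

Slicing works the same way as in the calls above: filter the rows first, then rank whatever remains.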
Top Product Features Before Conversion
The final model added to the open model hub is top_product_features_before_conversion. It calculates what users did before converting. You can specify which Event represents conversion, and optionally a subset of the location stack you want to know about.
As an example we can calculate which features were most used before clicking a link leading to our blog:
top_features_before_conversion = modelhub.agg.top_product_features_before_conversion(df, name='blog_press')
Similar to the top_product_features model, you can also slice on subsets of your product (using the location stack) or a selection of users.
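The idea behind this kind of analysis can be sketched in a few lines of plain Python: walk through each user's events in order, keep only what happened before their first conversion, and drop users who never convert. The event data and the `features_before_conversion` helper below are assumptions made for illustration, and counting distinct users per feature is one possible choice of metric, not necessarily the one the model hub uses.

```python
from collections import Counter

# Hypothetical time-ordered events as (user_id, feature_name, is_conversion).
events = [
    ("u1", "nav-docs", False),
    ("u1", "search", False),
    ("u1", "blog-link", True),   # u1 converts here
    ("u1", "nav-docs", False),   # after conversion, so ignored
    ("u2", "search", False),
    ("u2", "blog-link", True),   # u2 converts here
    ("u3", "nav-docs", False),   # u3 never converts, so ignored
]

def features_before_conversion(rows):
    """Count distinct converting users per feature used before conversion."""
    seen = {}          # user_id -> features used before converting
    converted = set()  # users whose first conversion has been reached
    for user_id, feature, is_conversion in rows:
        if user_id in converted:
            continue
        if is_conversion:
            converted.add(user_id)
        else:
            seen.setdefault(user_id, set()).add(feature)
    counts = Counter()
    for user_id in converted:
        counts.update(seen.get(user_id, set()))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

before_conversion = features_before_conversion(events)
print(before_conversion)
```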
How to get it
The three new models are now live in the open model hub. To use them, install the package from PyPI:
pip install objectiv-modelhub
If you already have the package installed, don't forget to upgrade:
pip install --upgrade objectiv-modelhub
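If you are not sure which version you currently have, one way to check from Python is the standard library's importlib.metadata (available since Python 3.8). The `installed_version` helper is just a small convenience wrapper written for this example.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

# Prints the version string, or None when the package is not installed.
print(installed_version("objectiv-modelhub"))
```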
If you want to test run these models on your own product without worrying about the Ops part, reach out to us to get a Launchpad: a free-to-use, fully managed Objectiv back-end and data store without any setup or configuration.
Introducing: Release Office Hours
If you have any questions about this release or anything else, or if you just want to say 'Hi!' to team Objectiv, we now have Release Office Hours every Thursday at 4pm CET (10am EST), which you can freely dial in to. If you're in a time zone that doesn’t fit well, just ping us on Slack and we'll send over an invite for a better moment.