Release: Run Objectiv in a Snowplow pipeline

Ivar Pruijn

This release fits data captured by both Objectiv and Snowplow into the same schema, allowing anyone to run Objectiv in an (existing) Snowplow pipeline.

Objectiv’s Global Contexts contain all general information that is relevant to the logged Event, such as the application and path it occurred in. They now fully fit the schema that is used by Snowplow. The image below shows how Global Contexts are now seamlessly mixed with Snowplow Contexts.

Objectiv Global Context data seamlessly mixed with Snowplow Contexts.

Now let’s see how you can work with both datasets.

1) Directly select Global Contexts when creating your Objectiv DataFrame

To work with Global Contexts, you now select which ones you want to use when you instantiate your Objectiv DataFrame; the selected Contexts are flattened out into the DataFrame itself. Select a Context by its name in lower case, without the word ‘Context’, i.e. MarketingContext becomes marketing:

# import the model hub from the objectiv-modelhub package
from modelhub import ModelHub

# instantiate the model hub, set the default time aggregation to daily
# and set the global contexts
modelhub = ModelHub(time_aggregation='%Y-%m-%d', global_contexts=['application', 'marketing'])
# get a Bach DataFrame with Objectiv data within a defined timeframe
df = modelhub.get_objectiv_dataframe(start_date='2022-08-01', end_date='2022-08-31')

...which in this example results in a DataFrame that has both application and marketing columns that contain the JSON for the respective Global Context:

[Image: head of the DataFrame with the flattened Global Context columns (/img/blog/releases/20220907/global-contexts-flattened-df-head.png)]
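The naming rule from this step can be sketched as a small helper. This is only an illustration: `context_key` is a hypothetical function, not part of modelhub, and it assumes the rule is simply "drop the trailing ‘Context’ and lower-case the rest". It only covers single-word names like the ones in the example above; how multi-word Context names are written is not covered here.

```python
def context_key(context_name: str) -> str:
    """Turn a Global Context name into the key used for selection.

    Hypothetical helper, not part of modelhub. Assumes the rule is:
    strip a trailing 'Context', then lower-case what remains.
    """
    suffix = "Context"
    if context_name.endswith(suffix):
        context_name = context_name[: -len(suffix)]
    return context_name.lower()


# Build the list you would pass as `global_contexts`
global_contexts = [context_key(name) for name in ("ApplicationContext", "MarketingContext")]
print(global_contexts)  # ['application', 'marketing']
```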

2) Interact directly with Global Context properties

Because the Global Contexts are directly available in the Objectiv DataFrame, you can interact with their properties via the context accessor. For example, to get the ID of the application as a Series, you now use:

df.application.context.id

Similarly, the application ID can be set as a regular (text) column in your DataFrame:

df['application_id'] = df.application.context.id

In the same way, you can get the source property of the MarketingContext as a column:

df['marketing_source'] = df.marketing.context.source
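Conceptually, reading a property from a Global Context column is a JSON field lookup per row. Below is a minimal sketch of that idea in plain Python, without Bach or a database. It is not how the context accessor is implemented. The sample rows are made up, `context_property` is a hypothetical helper, and the sketch assumes each cell holds a single JSON object.

```python
import json

# Made-up sample rows; the assumption is that each cell holds one JSON object.
rows = [
    {"application": '{"id": "website"}', "marketing": '{"source": "newsletter"}'},
    {"application": '{"id": "docs"}', "marketing": '{"source": "twitter"}'},
]


def context_property(row: dict, context: str, prop: str):
    """Return one property of a JSON-encoded context cell, or None if absent."""
    return json.loads(row[context]).get(prop)


# Mirror the two column assignments above, row by row
for row in rows:
    row["application_id"] = context_property(row, "application", "id")
    row["marketing_source"] = context_property(row, "marketing", "source")

print([(r["application_id"], r["marketing_source"]) for r in rows])
```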

3) Interact directly with Snowplow Contexts & properties

As the Global Contexts now fully fit Snowplow’s schema, the data can also be used in Snowplow analyses, so you can keep using your existing BI workflow.

For instance, here’s how you can query Objectiv data from a Snowplow analysis:

SELECT contexts_io_objectiv_context_application_context_1_0_0[OFFSET(0)].id AS application_id
FROM events
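If you need the same query for several Global Contexts, the column name can be generated. The sketch below infers the naming pattern from the single column shown in the query above, so treat it as an assumption: other Contexts may use a different schema version or naming. Both `snowplow_context_column` and `property_query` are hypothetical helpers, not part of Objectiv or Snowplow.

```python
def snowplow_context_column(context_key: str, version: str = "1_0_0") -> str:
    """Build the Snowplow column name for an Objectiv Global Context.

    Hypothetical helper; the pattern is inferred from the one example column
    `contexts_io_objectiv_context_application_context_1_0_0`.
    """
    return f"contexts_io_objectiv_context_{context_key}_context_{version}"


def property_query(context_key: str, prop: str, table: str = "events") -> str:
    """Return SQL that selects one property of the first context entry.

    Only pass trusted identifiers: the values are interpolated directly.
    """
    column = snowplow_context_column(context_key)
    return (
        f"SELECT {column}[OFFSET(0)].{prop} AS {context_key}_{prop}\n"
        f"FROM {table}"
    )


print(property_query("application", "id"))
```

For `application` and `id` this reproduces the query shown above.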

This also paves the way for deeper Snowplow integration the other way around: using your existing Snowplow Context data in Objectiv - stay tuned.

How to get it

Check out the Snowplow pipeline docs for how to set up a Snowplow pipeline.

To work with Objectiv, install the modelhub/Bach packages from PyPI:

pip install objectiv-modelhub

Or upgrade if you’ve already installed it:

pip install --upgrade objectiv-modelhub

For more info on how to use the new Global Contexts, see the open taxonomy example notebook.

To try Objectiv out in a lightweight setup, see how to run the Docker-based demo.

Office Hours

If you have any questions about this release or anything else, or if you just want to say ‘Hi!’ to team Objectiv, we have Office Hours every Thursday at 4pm CET (10am EST) that you can freely dial in to. If your timezone doesn’t fit well, just ping us on Slack and we’ll send over an invite for a better moment.

Join the Office Hours

Try Objectiv

Get Objectiv Up - Try Objectiv on your local machine (takes 5 minutes)
Objectiv on GitHub - Check out the project and star us for future reference
Objectiv on Slack - Join the discussion or get help