Identity Resolution
The open model hub package provides configurable Identity Resolution to help you track users across different sessions and devices with very little effort. By default, every event belongs to a session, and every session belongs to a user. However, this can be reconfigured to suit your application’s user identity needs, e.g. by using internal IDs for logged-in users.
How to use Identity Resolution
Identity Resolution should be configured on creating the Objectiv DataFrame:
- By default, cookie-based user identities are used; in this case, you don't need to specify anything.
- For more specific resolution, use the identity_resolution and (optionally) the anonymize_unidentified_users parameters in the get_objectiv_dataframe call.
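As a rough illustration of the second option, the sketch below collects the two parameters into keyword arguments. The helper identity_resolution_kwargs is hypothetical and not part of the model hub; the method name "email" and the ModelHub instance called modelhub in the final comment are assumptions as well.

```python
# Hypothetical helper (not part of the model hub): gathers the two
# parameters named above into keyword arguments for get_objectiv_dataframe.
def identity_resolution_kwargs(method, anonymize_unidentified_users):
    """Return keyword arguments that select an identification method.

    Passing method=None returns an empty dict, so the cookie-based
    default applies and nothing needs to be specified.
    """
    if method is None:
        return {}
    return {
        "identity_resolution": method,
        "anonymize_unidentified_users": anonymize_unidentified_users,
    }


# "email" is an illustrative method name; use the one your tracker sets.
kwargs = identity_resolution_kwargs("email", anonymize_unidentified_users=True)

# Assumed usage, given a ModelHub instance named `modelhub`:
# df = modelhub.get_objectiv_dataframe(**kwargs)
```

Passing both parameters explicitly keeps the example independent of whatever defaults the installed version uses.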
We recommend against tracking raw Personally Identifiable Information (PII). A simple way around that is to hash the source data and set the method in the id field accordingly, e.g. md5(some_data).
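A minimal sketch of this hashing step, using Python's standard hashlib module. The function name hashed_identity, the example address, and the normalization (trimming whitespace and lowercasing before hashing) are assumptions for illustration; the result is a plain dictionary with an id entry naming the method and a value entry holding the hash, not an actual tracker object.

```python
import hashlib


def hashed_identity(raw_value, source="email", algorithm="md5"):
    """Hash a raw identifier and describe the method in the id entry.

    Normalizing first (an assumption of this sketch) makes the same
    address yield the same hash regardless of casing or stray spaces.
    """
    normalized = raw_value.strip().lower()
    digest = hashlib.new(algorithm, normalized.encode("utf-8")).hexdigest()
    return {"id": f"{algorithm}({source})", "value": digest}


# Example address is made up; only the hash would be tracked.
context = hashed_identity("User@Example.com ")
```

Whichever hash function is used, it has to be applied identically on every platform, otherwise the same user ends up with different identifiers.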
How does it work?
Sessions are normally assigned to users that are identified by a cookie. In case the cookie isn’t available (e.g. deleted, user is on a different device, etc.), or not sufficient (e.g. because the user should be linked to an internal ID), the tracker needs some help. It should then be instructed to track the user identity explicitly, for example with a unique hash that identifies them persistently across devices when they log in.
An identity is tracked through an IdentityContext, which contains two fields:
- The id field, storing the identification method;
- The value field, containing the actual identifier, e.g. a unique User ID or email address.
See the Tracking section on how to configure the SDK for your platform to track this IdentityContext.
The method in the id field can then be used to configure model hub (in the get_objectiv_dataframe call) to use all matching IdentityContexts for identity resolution. This triggers the following flow:
- All cookie-based users get resolved to the last identity available for that user (filtered for the given method);
- Sessions are assigned to the new identity, with any parallel sessions (e.g. same user logged in on multiple devices) remaining intact;
- Sessions for users that do not have a new identity are either left alone, or they can be fully anonymized.
This behavior can be specified in the get_objectiv_dataframe call.
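The flow above can be sketched in plain Python to make the three steps concrete. This is an illustrative model and not the model hub's implementation: the record layouts, the function name resolve_identities, and the choice to represent an anonymized user as None are all assumptions.

```python
def resolve_identities(sessions, identities, method, anonymize_unidentified_users):
    """Model the resolution flow on plain dictionaries (illustrative only)."""
    # Step 1: per cookie-based user, keep the last identity tracked with
    # the given method; identities using other methods are ignored.
    latest = {}
    for identity in sorted(identities, key=lambda item: item["time"]):
        if identity["id"] == method:
            latest[identity["cookie_user"]] = identity["value"]

    resolved = []
    for session in sessions:
        cookie_user = session["user"]
        if cookie_user in latest:
            # Step 2: assign the session to the new identity. The session
            # itself is kept, so parallel sessions remain intact.
            user = latest[cookie_user]
        elif anonymize_unidentified_users:
            # Step 3a: no new identity, anonymize (None is an assumption).
            user = None
        else:
            # Step 3b: no new identity, leave the cookie-based user alone.
            user = cookie_user
        resolved.append({"session": session["session"], "user": user})
    return resolved


sessions = [
    {"session": 1, "user": "cookie-a"},
    {"session": 2, "user": "cookie-b"},
    {"session": 3, "user": "cookie-c"},
]
identities = [
    {"cookie_user": "cookie-a", "id": "email", "value": "hash-1", "time": 1},
    {"cookie_user": "cookie-a", "id": "email", "value": "hash-2", "time": 2},
    {"cookie_user": "cookie-b", "id": "email", "value": "hash-2", "time": 3},
    {"cookie_user": "cookie-c", "id": "phone", "value": "hash-9", "time": 4},
]

anonymized = resolve_identities(sessions, identities, "email", True)
untouched = resolve_identities(sessions, identities, "email", False)
```

In this made-up data, sessions 1 and 2 come from different cookies but resolve to the same identity, as with one user logged in on two devices, while session 3 only has an identity for a different method and is therefore treated as unidentified.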