# modelhub.ModelHub.aggregate

*property* aggregate

Access aggregation methods from the model hub. Same as `agg`.

*class* `Aggregate(mh)`

Models that return aggregated data in some form from the original DataFrame with Objectiv data.

### drop_off_locations

*static* `drop_off_locations(data, location_stack=None, groupby='user_id', percentage=False)`

Find the locations/features where users drop off, and their usage/share.

#### Parameters

- `data`: `bach.DataFrame` to apply the method on.
- `location_stack`: the column from which to create the drop-off locations. Can be a string with the name of a column in `data`, or a Series with the same base node as `data`. If None, the default location stack is taken.
  - can be any slice of a `modelhub.SeriesLocationStack` type column.
  - if `None`, the whole location stack is taken.
- `groupby`: sets the column(s) to group by.
- `percentage`: if True, calculate the percentage.

#### Returns

`bach.DataFrame` with the locations where users drop off, and the count/percentage.
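To make the idea concrete, here is a minimal plain-Python sketch of a drop-off count. It is not the modelhub implementation: it assumes that "dropping off" means the location of a user's last recorded event, and the `(user_id, moment, location)` tuple layout and the function name `drop_off_counts` are hypothetical.

```python
from collections import Counter


def drop_off_counts(events, percentage=False):
    """Count the last-seen location per user.

    events: iterable of (user_id, moment, location) tuples (hypothetical layout).
    Assumption: a user "drops off" at the location of their latest event.
    """
    last_seen = {}  # user_id -> (moment, location) of that user's latest event
    for user_id, moment, location in events:
        if user_id not in last_seen or moment > last_seen[user_id][0]:
            last_seen[user_id] = (moment, location)

    counts = Counter(location for _, location in last_seen.values())
    if percentage:
        total = sum(counts.values())
        return {loc: 100.0 * n / total for loc, n in counts.items()}
    return dict(counts)


events = [
    ("u1", 1, "home"), ("u1", 2, "pricing"),
    ("u2", 1, "home"),
    ("u3", 1, "home"), ("u3", 3, "pricing"),
]
print(drop_off_counts(events))
print(drop_off_counts(events, percentage=True))
```

The `percentage` flag here mirrors the parameter above only in spirit: it divides each count by the number of users.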

### frequency

`frequency(data)`

Calculate a frequency table for the number of users by number of sessions.

#### Parameters

- `data`: `bach.DataFrame` to apply the method on.

#### Returns

Series with results.
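As an illustration of what such a frequency table contains, the following plain-Python sketch counts distinct sessions per user and then counts users per session count. It does not use bach or modelhub; the `(user_id, session_id)` row layout and the name `session_frequency` are assumptions made for the example.

```python
from collections import Counter


def session_frequency(rows):
    """Map number-of-sessions -> number of users with that many sessions.

    rows: iterable of (user_id, session_id) pairs (hypothetical layout).
    """
    sessions_per_user = {}
    for user_id, session_id in rows:
        # A set keeps each session once, however many events it contains.
        sessions_per_user.setdefault(user_id, set()).add(session_id)
    return dict(Counter(len(s) for s in sessions_per_user.values()))


rows = [
    ("u1", "s1"), ("u1", "s1"), ("u1", "s2"),
    ("u2", "s3"),
    ("u3", "s4"),
]
print(session_frequency(rows))
```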

### funnel_conversion

`funnel_conversion(data, location_stack=None, groupby=None)`

Calculates conversion numbers for all location stacks in the `data`.

N.B. Filter the dataframe beforehand to filter down to the funnel locations.

For each step in a funnel, calculates the number of unique users who started it, the number of unique users who completed the step (defined as whether the user went to any other step in the funnel), the conversion rate to completing the step, the conversion rate to completing the step when looking at all users who started the funnel (= the "full" conversion rate), and the fraction of the users in the funnel dropping out at the given step.

N.B. This assumes that the funnel direction is always the same. Implementing VisibleEvents gives the most accurate conversion numbers, because both the number of users and the conversion rate are based on events on each location stack.

#### Parameters

- `data`: The `bach.DataFrame` to apply the operation on.
- `location_stack`: The column that holds the steps in the funnel. Can be:
  - A string of the name of the column in `data`.
  - Any slice of a `modelhub.SeriesLocationStack` type column.
  - A Series with the same base node as `data`.

  If its value is `None`, the whole location stack is taken.
- `groupby`: sets the column(s) to group by. It is also handy later for filtering the results.

#### Returns

`bach.DataFrame` with the following columns:

- `step`: the location considered as a step, e.g. a feature or root location.
- `n_users`: number of unique users starting the step.
- `n_users_completed_step`: number of unique users completing the step.
- `step_conversion_rate`: number of users completing the step / `n_users`.
- `full_conversion_rate`: number of users completing the step / number of users starting the funnel.
- `dropoff_share`: ratio between the users dropping out at a given step and the users at the beginning of the funnel.
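The three ratio columns can be worked through with plain arithmetic. The sketch below is not the modelhub implementation. It assumes that the per-step user counts are already known, that the first entry is the start of the funnel, and that the users dropping out at a step are `n_users - n_users_completed_step`; the function name `funnel_rates` and the sample numbers are made up for illustration.

```python
def funnel_rates(steps):
    """Derive the ratio columns from per-step user counts.

    steps: ordered list of (step, n_users, n_users_completed_step).
    Assumption: steps[0] is the start of the funnel.
    """
    if not steps:
        return []
    funnel_start = steps[0][1]  # users starting the funnel
    rows = []
    for step, n_users, n_completed in steps:
        rows.append({
            "step": step,
            "n_users": n_users,
            "n_users_completed_step": n_completed,
            # Guard against empty steps instead of dividing by zero.
            "step_conversion_rate": n_completed / n_users if n_users else 0.0,
            "full_conversion_rate": n_completed / funnel_start if funnel_start else 0.0,
            # Assumed: drop-outs are the users who started but did not complete.
            "dropoff_share": (n_users - n_completed) / funnel_start if funnel_start else 0.0,
        })
    return rows


steps = [("home", 100, 40), ("pricing", 40, 10), ("checkout", 10, 5)]
for row in funnel_rates(steps):
    print(row)
```

With these sample counts, 40 of the 100 users who start at `home` complete it, so both conversion rates for that step are 0.4 and its drop-off share is 0.6.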

### retention_matrix

`retention_matrix(data, time_period='monthly', event_type=None, start_date=None, end_date=None, percentage=False, display=True)`

Finds the number of users in a given cohort who are active at a given time period, where time is computed with respect to the beginning of each cohort. The "active user" is the user who made an action that we are interested in that time period. Users are divided into mutually exclusive cohorts, which are then tracked over time. In our case users are assigned to a cohort based on when they made their first action that we are interested in.

Returns the retention matrix dataframe, which represents users retained across cohorts:

- the index value represents the cohort
- the columns represent the number of time periods elapsed since the start of the cohort
- the values represent the number (or percentage) of unique active users of a given cohort

To calculate the retention matrix for a given time range, specify `start_date` and/or `end_date`.

N.B. The users' activity starts to be traced from the first date the user is seen in the `data`.

#### Parameters

- `data`: `bach.DataFrame` to apply the method on.
- `time_period`: can be 'daily', 'weekly', 'monthly' or 'yearly'.
- `event_type`: the event/action that we are interested in. Must be a valid event_type (either parent or child). If None, all events generated by the user are taken.
- `start_date`: start date of the retention matrix, e.g. '2022-04-01'. If None, all the data is taken.
- `end_date`: end date of the retention matrix, e.g. '2022-05-01'. If None, all the data is taken.
- `percentage`: if True, calculate the percentage with respect to the number of users in the cohort; otherwise keep the absolute values.
- `display`: if True, visualize the retention matrix as a heat map.

#### Returns

Retention matrix as a bach DataFrame.
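The cohort logic described above can be sketched in plain Python for the monthly case. This is an illustration only, not the modelhub implementation: the `(user_id, date)` event layout and the name `monthly_retention` are assumptions, and it returns absolute user counts without the `percentage` or `display` options.

```python
from datetime import date


def monthly_retention(events):
    """Return {cohort: {months_since_cohort_start: n_unique_active_users}}.

    events: iterable of (user_id, date) pairs (hypothetical layout).
    A user's cohort is the (year, month) of their first event.
    """
    events = list(events)  # iterated twice below

    first_seen = {}  # user_id -> date of first event
    for user_id, day in events:
        if user_id not in first_seen or day < first_seen[user_id]:
            first_seen[user_id] = day

    active = {}  # cohort -> {month offset -> set of active user ids}
    for user_id, day in events:
        start = first_seen[user_id]
        cohort = (start.year, start.month)
        offset = (day.year - start.year) * 12 + (day.month - start.month)
        active.setdefault(cohort, {}).setdefault(offset, set()).add(user_id)

    return {
        cohort: {offset: len(users) for offset, users in sorted(offsets.items())}
        for cohort, offsets in sorted(active.items())
    }


events = [
    ("u1", date(2022, 4, 3)), ("u1", date(2022, 5, 10)),
    ("u2", date(2022, 4, 20)),
    ("u3", date(2022, 5, 1)), ("u3", date(2022, 5, 15)),
]
print(monthly_retention(events))
```

Each outer key plays the role of an index value (the cohort) and each inner key the role of a column (periods since the cohort started).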

### session_duration

`session_duration(data, groupby=NotSet.token, exclude_bounces=True, method='mean')`

Calculate the duration of sessions.

With the default `method`, it calculates the mean of the session duration over the `groupby`.

#### Parameters

- `data`: `bach.DataFrame` to apply the method on.
- `groupby`: sets the column(s) to group by.
  - if not_set it defaults to using `ModelHub.time_agg`.
  - if None it aggregates over all data.
- `exclude_bounces`: if True, only session durations greater than 0 will be considered.
- `method`: 'mean' or 'sum'.

#### Returns

Series with results.
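A small plain-Python sketch shows how `exclude_bounces` and `method` interact. It is not the modelhub implementation: it assumes that a session's duration is the time between its first and last event, it aggregates over all data (no `groupby`), and the `(session_id, moment)` layout and the name `session_duration_sketch` are hypothetical.

```python
from datetime import datetime, timedelta


def session_duration_sketch(events, exclude_bounces=True, method="mean"):
    """Aggregate session durations over all sessions.

    events: iterable of (session_id, moment) pairs (hypothetical layout).
    Assumption: duration = last moment - first moment within a session.
    """
    bounds = {}  # session_id -> (first moment, last moment)
    for session_id, moment in events:
        first, last = bounds.get(session_id, (moment, moment))
        bounds[session_id] = (min(first, moment), max(last, moment))

    durations = [last - first for first, last in bounds.values()]
    if exclude_bounces:
        # A single-event session has zero duration and counts as a bounce here.
        durations = [d for d in durations if d > timedelta(0)]

    total = sum(durations, timedelta(0))
    if method == "sum":
        return total
    if method == "mean":
        return total / len(durations) if durations else None
    raise ValueError("method must be 'mean' or 'sum'")


events = [
    ("s1", datetime(2022, 4, 1, 10, 0)), ("s1", datetime(2022, 4, 1, 10, 5)),
    ("s2", datetime(2022, 4, 1, 10, 30)),
    ("s3", datetime(2022, 4, 1, 11, 0)), ("s3", datetime(2022, 4, 1, 11, 1)),
]
print(session_duration_sketch(events))
print(session_duration_sketch(events, exclude_bounces=False))
print(session_duration_sketch(events, method="sum"))
```

Excluding the zero-length session `s2` raises the mean, since only the two sessions with more than one event remain.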

### top_product_features

`top_product_features(data, location_stack=None, event_type='InteractiveEvent')`

Calculate the top used features in the product.

#### Parameters

- `data`: `bach.DataFrame` to apply the method on.
- `location_stack`: the location stack.
  - can be any slice of a `modelhub.SeriesLocationStack` type column.
  - if None, the whole location stack is taken.
- `event_type`: event type. Must be a valid event_type (either parent or child).

#### Returns

bach DataFrame with results.
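One way to picture a "top used features" ranking is to count distinct users per feature for the chosen event type. The sketch below is plain Python under that assumption and is not the modelhub implementation. It matches `event_type` by exact string only, so it ignores the parent/child event hierarchy, and the `(user_id, event_type, feature)` layout and the name `top_features_sketch` are hypothetical.

```python
def top_features_sketch(events, event_type="InteractiveEvent"):
    """Rank features by number of distinct users, most used first.

    events: iterable of (user_id, event_type, feature) tuples (hypothetical layout).
    Assumption: "usage" is measured as unique users per feature.
    """
    users_per_feature = {}
    for user_id, etype, feature in events:
        if etype == event_type:  # exact match; no event-type hierarchy here
            users_per_feature.setdefault(feature, set()).add(user_id)

    ranked = [(feature, len(users)) for feature, users in users_per_feature.items()]
    # Sort by user count descending, then by feature name for stable output.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


events = [
    ("u1", "InteractiveEvent", "search"),
    ("u2", "InteractiveEvent", "search"),
    ("u1", "InteractiveEvent", "login"),
    ("u1", "InteractiveEvent", "login"),
    ("u3", "VisibleEvent", "banner"),
]
print(top_features_sketch(events))
```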

### top_product_features_before_conversion

`top_product_features_before_conversion(data, name, location_stack=None, event_type='InteractiveEvent')`

Calculates what users did before converting by combining several models from the model hub.

#### Parameters

- `data`: `bach.DataFrame` to apply the method on.
- `name`: label of the conversion event.
- `location_stack`: the location stack.
  - can be any slice of a `modelhub.SeriesLocationStack` type column.
  - if None, the whole location stack is taken.
- `event_type`: event type. Must be a valid event_type (either parent or child).

#### Returns

bach DataFrame with results.

### unique_sessions

`unique_sessions(data, groupby=NotSet.token)`

Calculate the unique sessions in the Objectiv `data`.

#### Parameters

- `data`: `bach.DataFrame` to apply the method on.
- `groupby`: sets the column(s) to group by.
  - if not_set it defaults to using `ModelHub.time_agg`.
  - if None it aggregates over all data.

#### Returns

Series with results.

### unique_users

`unique_users(data, groupby=NotSet.token)`

Calculate the unique users in the Objectiv `data`.

#### Parameters

- `data`: `bach.DataFrame` to apply the method on.
- `groupby`: sets the column(s) to group by.
  - if not_set it defaults to using `ModelHub.time_agg`.
  - if None it aggregates over all data.

#### Returns

Series with results.
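The difference between grouping by time and aggregating over all data can be sketched in plain Python. This is not the modelhub implementation: it assumes that the time-based grouping amounts to formatting each event's moment with a `strftime` pattern, and the `(user_id, moment)` layout, the `time_format` parameter, and the name `unique_users_sketch` are hypothetical. The same pattern applies to unique sessions by swapping the identifier.

```python
from datetime import datetime


def unique_users_sketch(events, time_format="%Y-%m-%d"):
    """Count distinct users per time bucket.

    events: iterable of (user_id, moment) pairs (hypothetical layout).
    time_format: strftime pattern used as the group key (assumed stand-in
    for a time aggregation); pass None to aggregate over all data.
    """
    groups = {}  # group key -> set of user ids
    for user_id, moment in events:
        key = moment.strftime(time_format) if time_format else None
        groups.setdefault(key, set()).add(user_id)
    return {key: len(users) for key, users in groups.items()}


events = [
    ("u1", datetime(2022, 4, 1, 9, 0)),
    ("u1", datetime(2022, 4, 1, 17, 0)),
    ("u2", datetime(2022, 4, 1, 10, 0)),
    ("u1", datetime(2022, 4, 2, 8, 0)),
]
print(unique_users_sketch(events))                    # one count per day
print(unique_users_sketch(events, time_format=None))  # one count for all data
```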