modelhub.FunnelDiscovery.get_navigation_paths
get_navigation_paths(data, steps, by=NotSet.token, location_stack=None, add_conversion_step_column=False, only_converted_paths=False, start_from_end=False, n_examples=None, sort_by=None)
Get the navigation paths for each event's location stack. Each navigation path is represented as a row, where each step is defined by the nice name of the considered location.
For each location stack:
The number of navigation paths to be generated is less than or equal to steps.
The locations to be considered as starting steps are those that have an offset between 0 and steps - 1 in the location stack.
For each path, the remaining steps are defined by the steps - 1 locations that follow the start location in the location stack.
For example, location_stack = ['a', 'b', 'c', 'd'] and steps = 3 will generate the following paths:
'a', 'b', 'c'
'b', 'c', 'd'
'c', 'd', None
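The rules above can be sketched in plain Python. This is only an illustration of the described windowing, not the library's implementation (which operates on a Bach DataFrame); the helper name forward_paths is hypothetical, and capping the number of start offsets at the stack length is an assumption drawn from the "less than or equal to steps" rule.

```python
from typing import List, Optional, Sequence, Tuple


def forward_paths(
    location_stack: Sequence[str], steps: int
) -> List[Tuple[Optional[str], ...]]:
    """Hypothetical helper: build paths of length `steps` from one stack."""
    paths = []
    # Start offsets run from 0 to steps - 1 (assumed capped by the stack length).
    for start in range(min(steps, len(location_stack))):
        # The start location plus the steps - 1 locations that follow it.
        window = list(location_stack[start:start + steps])
        # Pad with None when the stack ends before the path is full.
        window += [None] * (steps - len(window))
        paths.append(tuple(window))
    return paths


print(forward_paths(['a', 'b', 'c', 'd'], 3))
# [('a', 'b', 'c'), ('b', 'c', 'd'), ('c', 'd', None)]
```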
Parameters
data (bach.dataframe.DataFrame) - bach.DataFrame to apply the method on.
steps (int) - Number of steps/locations to consider in a navigation path.
by (Union[List[Union[str, bach.series.series.Series]], str, bach.series.series.Series, sql_models.constants.NotSet]) - sets the column(s) to group by. If by is None or not set, steps follow the order of events across the entire dataset.
location_stack (Union[str, SeriesString, SeriesLocationStack, SeriesInt64]) - the column from which to create the paths. Can be a string with the name of the column in data, or a Series with the same base node as data. If None, the default location stack is taken.
add_conversion_step_column (bool) - if True, gets the first conversion step number for each navigation path and adds it as a column to the returned DataFrame.
only_converted_paths (bool) - if True, filters each navigation path to its first conversion location.
start_from_end (bool) - if True, starts the construction of navigation paths from the last context in the stack; otherwise it starts from the first. If there are too many steps and the number of paths is limited with the n_examples parameter, the user's last steps can be lost; use this parameter to prioritize those last steps. With location_stack = ['a', 'b', 'c', 'd'] and steps = 3, it generates the following paths:
'b', 'c', 'd'
'a', 'b', 'c'
None, 'a', 'b'
n_examples (int) - limits the number of navigation paths. If None, all the navigation paths are taken.
sort_by (str) - column to sort by for determining the order of the sequences of location_stack.
Returns
Bach DataFrame containing a new Series for each step containing the nice name of the location.
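The effect of start_from_end combined with n_examples can be sketched in plain Python. This is an illustration under assumptions, not the library's implementation: the helper name backward_paths is hypothetical, and modelling n_examples as a simple truncation of the generated paths for a single stack is a simplification of how the limit applies to the returned DataFrame.

```python
from typing import List, Optional, Sequence, Tuple


def backward_paths(
    location_stack: Sequence[str],
    steps: int,
    n_examples: Optional[int] = None,
) -> List[Tuple[Optional[str], ...]]:
    """Hypothetical helper: build paths starting from the end of the stack."""
    paths = []
    n = len(location_stack)
    for back in range(min(steps, n)):
        end = n - back  # exclusive end offset of this path
        window = list(location_stack[max(0, end - steps):end])
        # Pad on the left with None when the stack begins before the path is full.
        window = [None] * (steps - len(window)) + window
        paths.append(tuple(window))
    # Assumption: n_examples keeps only the first paths generated.
    return paths if n_examples is None else paths[:n_examples]


print(backward_paths(['a', 'b', 'c', 'd'], 3))
# [('b', 'c', 'd'), ('a', 'b', 'c'), (None, 'a', 'b')]
print(backward_paths(['a', 'b', 'c', 'd'], 3, n_examples=1))
# [('b', 'c', 'd')]
```

Because the paths are generated from the end of the stack, a limit keeps the path ending in the last location ('d') and drops the earlier ones, which is the prioritization the start_from_end description refers to.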