We've released v0.0.14. This release adds several new pandas-like date & time operations to the Bach modeling library that provide more granular control when wrangling/exploring time-related data.
Like any Bach operation, they work on the full SQL dataset.
A few highlights of the features that we’ve added:
- TimeDelta: Get the days, seconds, microseconds, or components between two time series.
- quantile(): Return values at the given quantile of numeric columns.
- unstack(): Reshape a DataFrame or Series that has a multi-level index by pivoting an index level into columns.
- cut(): Bin values into discrete intervals, e.g. to segment users into age ranges.
- qcut(): Bin values into equal-sized buckets based on rank or quantiles.
- dropna(): Drop rows that contain null values from a DataFrame or Series.
null gaps, with either a value, or
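To make the binning and null-handling operations concrete, here is a minimal sketch using plain pandas, whose behavior these operations are modeled on. The sample ages and bin edges are made up for illustration, and the calls shown are pandas calls. The Bach entry points and parameters may differ, so check the Bach docs before copying this over.

```python
import pandas as pd

# Hypothetical sample data: user ages, with one missing value.
ages = pd.Series([15, 22, 37, 41, 58, None, 63, 29, 34], name="age")

# dropna(): remove the row that has no age.
known_ages = ages.dropna()

# cut(): bin into intervals with explicit edges, here age ranges.
# Intervals are right-inclusive: (0, 18], (18, 40], (40, 65].
age_groups = pd.cut(
    known_ages,
    bins=[0, 18, 40, 65],
    labels=["0-18", "19-40", "41-65"],
)

# qcut(): bin by quantiles, so each bucket holds the same number of rows
# whenever the data divides evenly.
age_quartiles = pd.qcut(known_ages, q=4, labels=["q1", "q2", "q3", "q4"])

print(age_groups.value_counts().sort_index())
print(age_quartiles.value_counts().sort_index())
```

The difference between the two is the design choice to notice: cut() fixes the interval edges and lets the bucket sizes vary, while qcut() fixes the bucket sizes and lets the edges follow the data.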
An example of the new
Modeling session duration distribution with quantiles in Bach
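As a rough illustration of that kind of analysis, the sketch below computes session durations as time deltas and then takes quantiles of them. It uses plain pandas rather than Bach, and the column names and timestamps are hypothetical. Bach runs the equivalent operations on the SQL dataset and its exact signatures may differ.

```python
import pandas as pd

# Hypothetical session data: one row per session, with start and end times.
sessions = pd.DataFrame({
    "session_start": pd.to_datetime([
        "2022-06-01 10:00:00",
        "2022-06-01 11:30:00",
        "2022-06-01 12:00:00",
        "2022-06-02 09:15:00",
    ]),
    "session_end": pd.to_datetime([
        "2022-06-01 10:05:00",
        "2022-06-01 11:31:00",
        "2022-06-01 12:30:00",
        "2022-06-02 09:25:00",
    ]),
})

# Subtracting two datetime columns yields a timedelta column.
sessions["duration"] = sessions["session_end"] - sessions["session_start"]

# Break each timedelta into components (days, hours, minutes, ...).
duration_minutes = sessions["duration"].dt.components["minutes"]

# Convert to seconds so the durations can be treated as plain numbers.
duration_seconds = sessions["duration"].dt.total_seconds()

# quantile(): the durations below which 50% and 90% of sessions fall.
duration_quantiles = duration_seconds.quantile([0.5, 0.9])
print(duration_quantiles)
```

Quantiles are often more informative than the mean for session durations, because a few very long sessions can skew an average heavily.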
Other features for exploration and modeling that were added:
- display_sql_markdown() support: Display SQL results as Markdown, making them easier to read in notebooks.
- describe() for all values: Generate descriptive statistics that summarize the shape of a dataset’s distribution, such as count, max, min, mean, and standard deviation.
- Variable time aggregation in models in the model hub.
For more details, see:
- Bach in our repo, for how to run the library.
- The Bach modeling library docs, for how to use all the new functionality.
- The full changelog.