bach.Series

class bach.Series(engine, base_node, index, name, expression, group_by, instance_dtype, order_by=None, **kwargs)

Series is an abstract class. An instance of Series represents a column of data. Specific subclasses are used to represent specific types of data and enable operations on that data.

A Series can also be used on its own to work with a single list of values. Many standard operations are available: arithmetic such as add or subtract, aggregations such as nunique() or count(), and operations that create new sub-Series, such as unique().
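The three kinds of operations mentioned above differ in the shape of their result. The following is a minimal plain-Python sketch of that difference, using ordinary lists in place of a Series. It does not use Bach or a database, and the variable names (`prices`, `discounts`, and so on) are hypothetical.

```python
# Plain lists stand in for two columns that share the same rows.
# These names are illustrative only; nothing here calls Bach.
prices = [3, 1, 3, 2, 1]
discounts = [1, 0, 1, 0, 0]

# Element-wise arithmetic (like subtract): one result per input row.
net = [p - d for p, d in zip(prices, discounts)]

# Aggregations (like count() and nunique()): one value for the whole column.
row_count = len(prices)
distinct_count = len(set(prices))

# A new sub-Series (like unique()): fewer rows than the input.
# Sorted here only to make the result deterministic.
distinct_values = sorted(set(prices))

print(net, row_count, distinct_count, distinct_values)
```

With a real Series the same operations are expressed on the Series object, and the work is done when the data is retrieved, for example with head() or to_pandas().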

Methods

agg(func[, group_by]): Apply one or more aggregation functions to this Series.
aggregate(func[, group_by]): Alias for agg().
all_values(): For every row in this Series, do multiple evaluations where all sub-evaluations should be True.
any_value(): For every row in this Series, do multiple evaluations where any sub-evaluation should be True.
append(other[, ignore_index]): Append rows of other series to the caller series.
apply_func(func, *args, **kwargs): Apply the given functions to this Series.
astype(dtype): Convert this Series to another type.
copy(): Return a copy of this Series.
count([partition, skipna]): Returns the number of rows in each partition or for all values if none is given.
describe([percentiles, datetime_is_numeric]): Returns descriptive statistics; the output varies depending on what is provided.
drop_duplicates([keep]): Return a series with duplicated rows removed.
dropna(): Removes rows with missing values.
exists(): Boolean operation that returns True if there are one or more values in this Series.
fillna(other): Fill any NULL value with the given constant or other compatible Series.
from_value(base, value[, name, dtype]): Create an instance of this class that represents a column with the given value.
get_db_dtype(dialect): Give the database type used to store values of this Series, for the given database dialect.
head([n]): Get the first n rows from this Series as a pandas.Series.
isin(other): Evaluate for every row in this series whether the value is contained in other.
isnull(): Evaluate for every row in this series whether the value is missing or NULL.
materialize([node_name, limit, distinct, ...]): Create a copy of this Series with the current Series's state as its base_node.
max([partition, skipna]): Returns the maximum value in each partition or for all values if none is given.
median([partition, skipna]): Returns the median in each partition or for all values if none is given.
min([partition, skipna]): Returns the minimum value in each partition or for all values if none is given.
mode([partition, skipna]): Returns the mode in each partition or for all values if none is given.
notnull(): Evaluate for every row in this series whether the value is not missing or NULL.
nunique([partition, skipna]): Returns the number of unique values in each partition or for all values if none is given.
reset_index([level, drop]): Drops the current index.
sort_by_series(by, *[, ascending]): Sort this Series by other Series that have the same base node as this Series.
sort_index(*[, ascending]): Sort this Series by its index.
sort_values(*[, ascending]): Sort this Series by its values.
to_frame(): Create a DataFrame with the index and data from this Series.
to_numpy(): Return a Numpy representation of the Series, akin to pandas.Series.to_numpy.
to_pandas([limit]): Get the data from this Series as a pandas.Series. The optional limit is applied either as a maximum number of rows or as a slice.
unique([partition, skipna]): Return all unique values in this Series.
unstack([level, fill_value, aggregation]): Pivot a level of the index labels.
value_counts([normalize, sort, ascending, ...]): Returns a series containing counts per unique value.
view_sql()
window_cume_dist([window]): Returns the cumulative distribution, that is (number of partition rows preceding or peers with current row) / (total partition rows).
window_dense_rank([window]): Returns the rank of the current row, without gaps; this function effectively counts peer groups.
window_first_value([window]): Returns value evaluated at the row that is the first row of the window frame.
window_lag([offset, default, window]): Returns value evaluated at the row that is offset rows before the current row within the window.
window_last_value([window]): Returns value evaluated at the row that is the last row of the window frame.
window_lead([offset, default, window]): Returns value evaluated at the row that is offset rows after the current row within the window.
window_nth_value(n[, window]): Returns value evaluated at the row that is the n'th row of the window frame.
window_ntile([num_buckets, window]): Returns an integer ranging from 1 to the argument value, dividing the partition as equally as possible.
window_percent_rank([window]): Returns the relative rank of the current row, that is (rank - 1) / (total partition rows - 1).
window_rank([window]): Returns the rank of the current row, with gaps; that is, the row_number of the first row in its peer group.
window_row_number([window]): Returns the number of the current row within its window, counting from 1.
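
The ranking window functions listed above differ mainly in how they treat peers, meaning rows with equal sort values. The following is a plain-Python sketch of the formulas as stated in the descriptions, applied to a single partition. It is not how Bach computes them: the function name `ranking_functions` is hypothetical, and it assumes the window is ordered ascending by the value itself.

```python
from itertools import groupby


def ranking_functions(values):
    """Compute ranking window functions over one partition.

    Illustrative only: assumes the window orders rows ascending by value
    and treats equal values as peers.
    """
    ordered = sorted(values)
    total = len(ordered)
    rows = []
    preceding = 0   # number of rows before the current peer group
    dense_rank = 0  # number of peer groups seen so far

    for value, peers in groupby(ordered):
        peer_count = len(list(peers))
        dense_rank += 1
        # rank: the row_number of the first row in the peer group (with gaps)
        rank = preceding + 1
        # (rank - 1) / (total partition rows - 1); 0 for a one-row partition
        percent_rank = (rank - 1) / (total - 1) if total > 1 else 0.0
        # (rows preceding or peers with current row) / (total partition rows)
        cume_dist = (preceding + peer_count) / total
        for offset in range(peer_count):
            rows.append({
                "value": value,
                "row_number": preceding + offset + 1,  # counts from 1
                "rank": rank,
                "dense_rank": dense_rank,
                "percent_rank": percent_rank,
                "cume_dist": cume_dist,
            })
        preceding += peer_count
    return rows


for row in ranking_functions([10, 20, 20, 30]):
    print(row)
```

For the input `[10, 20, 20, 30]`, the two rows with value 20 are peers: both get rank 2 and dense_rank 2, while the row with value 30 gets rank 4 (a gap) and dense_rank 3 (no gap).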

Attributes

array: .array property accessor, akin to pandas.Series.array.
base_node: Get this Series' base_node.
group_by: Get this Series' group_by, if any.
index: Get this Series' index dictionary {name: Series}.
instance_dtype: Get the instance_dtype.
is_materialized: Return true if this Series is in a materialized state, i.e. all information about the Series's values is encoded in self.base_node.
name: Get this Series' name.
order_by: Get the series expressions for sorting this Series.
value: Retrieve the actual single value of this series.