bach.Series
class bach.Series(engine, base_node, index, name, expression, group_by, instance_dtype, order_by=None, **kwargs)
Series is an abstract class. An instance of Series represents a column of data. Specific subclasses are used to represent specific types of data and enable operations on that data.
A Series can also be used on its own to work with a single list of values. Many standard operations are available on Series: arithmetic such as addition or subtraction, aggregations such as nunique() or count(), and operations that create new sub-Series, such as unique().
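As a rough illustration of the difference between an aggregation and a sub-Series operation, the sketch below uses plain Python lists as a stand-in for a column of data. It is not bach code and does not touch a database; the helper functions only borrow the names count, nunique and unique, and treating None as a missing value that is skipped is an assumption made for this example.

```python
from typing import Hashable, List, Optional, Sequence

Column = Sequence[Optional[Hashable]]


def count(values: Column) -> int:
    """Aggregation: number of non-missing rows (None is skipped by assumption)."""
    return sum(1 for v in values if v is not None)


def nunique(values: Column) -> int:
    """Aggregation: number of distinct non-missing values."""
    return len({v for v in values if v is not None})


def unique(values: Column) -> List[Hashable]:
    """Sub-series: the distinct non-missing values, in order of first appearance."""
    seen = set()
    result = []
    for v in values:
        if v is not None and v not in seen:
            seen.add(v)
            result.append(v)
    return result


# Hypothetical example column; None plays the role of NULL.
column = ["a", "b", "a", None, "c"]
print(count(column))    # 4
print(nunique(column))  # 3
print(unique(column))   # ['a', 'b', 'c']
```

The two aggregations reduce the column to a single number, while unique returns a new, shorter column of values.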
Methods
agg (func[, group_by]) | Apply one or more aggregation functions to this Series. |
aggregate (func[, group_by]) | Alias for agg() . |
all_values () | For every row in this Series, do multiple evaluations where all sub-evaluations should be True |
any_value () | For every row in this Series, do multiple evaluations where any sub-evaluation should be True |
append (other[, ignore_index]) | Append rows of other series to the caller series. |
apply_func (func, *args, **kwargs) | Apply the given functions to this Series. |
astype (dtype) | Convert this Series to another type. |
copy () | Return a copy of this Series. |
count ([partition, skipna]) | Returns the number of rows in each partition or for all values if none is given. |
describe ([percentiles, datetime_is_numeric]) | Returns descriptive statistics; the result varies based on what is provided. |
drop_duplicates ([keep]) | Return a series with duplicated rows removed. |
dropna () | Removes rows with missing values. |
exists () | Boolean operation that returns True if there are one or more values in this Series |
fillna (other) | Fill any NULL value with the given constant or other compatible Series |
from_value (base, value[, name, dtype]) | Create an instance of this class, that represents a column with the given value. |
get_db_dtype (dialect) | Give the database type used to store values of this Series, for the given database dialect. |
head ([n]) | Get the first n rows from this Series as a pandas.Series. |
isin (other) | Evaluate for every row in this series whether the value is contained in other |
isnull () | Evaluate for every row in this series whether the value is missing or NULL. |
materialize ([node_name, limit, distinct, ...]) | Create a copy of this Series with as base_node the current Series's state. |
max ([partition, skipna]) | Returns the maximum value in each partition or for all values if none is given. |
median ([partition, skipna]) | Returns the median in each partition or for all values if none is given. |
min ([partition, skipna]) | Returns the minimum value in each partition or for all values if none is given. |
mode ([partition, skipna]) | Returns the mode in each partition or for all values if none is given. |
notnull () | Evaluate for every row in this series whether the value is not missing or NULL. |
nunique ([partition, skipna]) | Returns the number of unique values in each partition or for all values if none is given. |
reset_index ([level, drop]) | Drops the current index. |
sort_by_series (by, *[, ascending]) | Sort this Series by other Series that have the same base node as this Series. |
sort_index (*[, ascending]) | Sort this Series by its index. |
sort_values (*[, ascending]) | Sort this Series by its values. |
to_frame () | Create a DataFrame with the index and data from this Series. |
to_numpy () | Return a Numpy representation of the Series, akin to pandas.Series.to_numpy. |
to_pandas ([limit]) | Get the data from this series as a pandas.Series. The limit parameter is the limit to apply, either as a max number of rows or a slice. |
unique ([partition, skipna]) | Return all unique values in this Series. |
unstack ([level, fill_value, aggregation]) | Pivot a level of the index labels. |
value_counts ([normalize, sort, ascending, ...]) | Returns a series containing counts per unique value |
view_sql () | |
window_cume_dist ([window]) | Returns the cumulative distribution, that is (number of partition rows preceding or peers with current row) / (total partition rows). |
window_dense_rank ([window]) | Returns the rank of the current row, without gaps; this function effectively counts peer groups. |
window_first_value ([window]) | Returns value evaluated at the row that is the first row of the window frame. |
window_lag ([offset, default, window]) | Returns value evaluated at the row that is offset rows before the current row within the window |
window_last_value ([window]) | Returns value evaluated at the row that is the last row of the window frame. |
window_lead ([offset, default, window]) | Returns value evaluated at the row that is offset rows after the current row within the window. |
window_nth_value (n[, window]) | Returns value evaluated at the row that is the n'th row of the window frame. |
window_ntile ([num_buckets, window]) | Returns an integer ranging from 1 to the argument value, dividing the partition as equally as possible. |
window_percent_rank ([window]) | Returns the relative rank of the current row, that is (rank - 1) / (total partition rows - 1). |
window_rank ([window]) | Returns the rank of the current row, with gaps; that is, the row_number of the first row in its peer group. |
window_row_number ([window]) | Returns the number of the current row within its window, counting from 1. |
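The ranking formulas in the window_* descriptions above can be checked by hand with a small sketch. The code below is plain Python, not bach code: it assumes a single partition whose rows are already sorted ascending by the value being ranked, and the function names are hypothetical helpers that mirror the descriptions rather than the library's methods.

```python
from typing import List, Sequence


def row_number(values: Sequence[float]) -> List[int]:
    """Number of each row within the window, counting from 1."""
    return list(range(1, len(values) + 1))


def rank(values: Sequence[float]) -> List[int]:
    """Rank with gaps: the row number of the first row in each peer group."""
    ranks: List[int] = []
    for i, v in enumerate(values):
        if i > 0 and v == values[i - 1]:
            ranks.append(ranks[-1])   # same peer group as the previous row
        else:
            ranks.append(i + 1)       # first row of a new peer group
    return ranks


def dense_rank(values: Sequence[float]) -> List[int]:
    """Rank without gaps: effectively counts peer groups."""
    ranks: List[int] = []
    for i, v in enumerate(values):
        if i == 0:
            ranks.append(1)
        elif v == values[i - 1]:
            ranks.append(ranks[-1])
        else:
            ranks.append(ranks[-1] + 1)
    return ranks


def percent_rank(values: Sequence[float]) -> List[float]:
    """(rank - 1) / (total partition rows - 1); 0.0 for a single-row partition."""
    n = len(values)
    if n <= 1:
        return [0.0] * n
    return [(r - 1) / (n - 1) for r in rank(values)]


def cume_dist(values: Sequence[float]) -> List[float]:
    """(rows preceding or peers with the current row) / (total partition rows)."""
    n = len(values)
    return [sum(1 for w in values if w <= v) / n for v in values]


# Hypothetical partition, already sorted; the two 20s form one peer group.
partition = [10, 20, 20, 30]
print(row_number(partition))  # [1, 2, 3, 4]
print(rank(partition))        # [1, 2, 2, 4]
print(dense_rank(partition))  # [1, 2, 2, 3]
print(cume_dist(partition))   # [0.25, 0.75, 0.75, 1.0]
```

The tied values show the difference between the functions: rank skips position 3 after the tie, dense_rank does not, and cume_dist assigns both peers the same fraction.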
Attributes
array | .array property accessor, akin to pandas.Series.array |
base_node | Get this Series' base_node |
group_by | Get this Series' group_by, if any. |
index | Get this Series' index dictionary {name: Series} |
instance_dtype | Get the instance_dtype. |
is_materialized | Return true if this Series is in a materialized state, i.e. all information about the Series's values is encoded in self.base_node. |
name | Get this Series' name |
order_by | Get the series expressions for sorting this Series. |
value | Retrieve the actual single value of this series. |