Series
class bach.Series(engine, base_node, index, name, expression, group_by, instance_dtype, order_by=None, **kwargs)
Series is an abstract class. An instance of Series represents a column of data. Specific subclasses represent specific types of data and enable operations on that data.
A Series can also be used on its own, to work with a single list of values. Many standard operations
are available on Series: arithmetic such as add or subtract, aggregations such as nunique() or count(),
and operations that create new sub-Series, such as unique().
Reference by function
Creation / re-framing
Series.to_frame () | Create a DataFrame with the index and data from this Series. |
Series.copy () | Return a copy of this Series. |
Value accessors
Series.head ([n]) | Get the first n rows from this Series as a pandas.Series. |
Series.to_pandas ([limit]) | Get the data from this Series as a pandas.Series. The optional limit is applied either as a maximum number of rows or as a slice. |
Series.array | .array property accessor, akin to pandas.Series.array. |
Series.value | Retrieve the actual single value of this series. |
Attributes and underlying data
Axes
Series.name | Get this Series' name |
Series.index | Get this Series' index dictionary {name: Series} |
Series.group_by | Get this Series' group_by, if any. |
Series.order_by | Get the series expressions for sorting this Series. |
Types
Series.dtype | The dtype of this Series. |
Series.astype (dtype) | Convert this Series to another type. |
SQL Model
Series.base_node | Get this Series' base_node |
Series.materialize ([node_name, limit, ...]) | Create a copy of this Series whose base_node is the current state of this Series. |
Series.view_sql () |
Comparison and set operations
Series.all_values () | For every row in this Series, do multiple evaluations where all sub-evaluations should be True |
Series.any_value () | For every row in this Series, do multiple evaluations where any sub-evaluation should be True |
Series.exists () | Boolean operation that returns True if there are one or more values in this Series |
Series.isin (other) | Evaluate for every row in this series whether the value is contained in other |
Series.isnull () | Evaluate for every row in this series whether the value is missing or NULL. |
Series.notnull () | Evaluate for every row in this series whether the value is not missing or NULL. |
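
The row-wise semantics of isin(), isnull() and notnull() described above can be illustrated with a minimal sketch. It does not use bach: plain Python lists stand in for a Series and None stands in for NULL, and the three helper functions are hypothetical names chosen to mirror the table. In bach these evaluations would run in the database and yield a boolean Series rather than a list.

```python
from typing import Hashable, Iterable, List, Optional


def isin(values: List[Hashable], other: Iterable[Hashable]) -> List[bool]:
    """For every row, report whether its value is contained in `other`."""
    allowed = set(other)
    return [value in allowed for value in values]


def isnull(values: List[Optional[Hashable]]) -> List[bool]:
    """For every row, report whether its value is missing (None stands in for NULL)."""
    return [value is None for value in values]


def notnull(values: List[Optional[Hashable]]) -> List[bool]:
    """For every row, report whether its value is present."""
    return [value is not None for value in values]


# Hypothetical data, used only for illustration.
cities = ["Amsterdam", "Paris", "Utrecht"]
ratings = [4, None, 5]

in_netherlands = isin(cities, ["Amsterdam", "Utrecht"])
missing_rating = isnull(ratings)
has_rating = notnull(ratings)
```

Each helper returns one boolean per input row, which is the shape the table describes for the corresponding Series methods.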
Conversion, reshaping, sorting
Series.reset_index ([level, drop]) | Drops the current index. |
Series.sort_index (*[, ascending]) | Sort this Series by its index. |
Series.sort_values (*[, ascending]) | Sort this Series by its values. |
Series.fillna (other) | Fill any NULL value with the given constant or other compatible Series |
Series.append (other[, ignore_index]) | Append rows of other series to the caller series. |
Series.drop_duplicates ([keep]) | Return a series with duplicated rows removed. |
Series.dropna () | Removes rows with missing values. |
Series.unstack ([level, fill_value, aggregation]) | Pivot a level of the index labels. |
Function application, aggregation & windowing
Series.agg (func[, group_by]) | Apply one or more aggregation functions to this Series. |
Series.aggregate (func[, group_by]) | Alias for agg() . |
Series.apply_func (func, *args, **kwargs) | Apply the given functions to this Series. |
Computations & descriptive stats
All types
Series.describe ([percentiles, ...]) | Returns descriptive statistics; the result varies based on what is provided. |
Series.count ([partition, skipna]) | Returns the number of rows in each partition or for all values if none is given. |
Series.min ([partition, skipna]) | Returns the minimum value in each partition or for all values if none is given. |
Series.max ([partition, skipna]) | Returns the maximum value in each partition or for all values if none is given. |
Series.median ([partition, skipna]) | Returns the median in each partition or for all values if none is given. |
Series.mode ([partition, skipna]) | Returns the mode in each partition or for all values if none is given. |
Series.nunique ([partition, skipna]) | Returns the number of unique values in each partition or for all values if none is given. |
Series.value_counts ([normalize, sort, ...]) | Returns a series containing counts per unique value |
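
To make the phrase "in each partition or for all values if none is given" concrete, here is a minimal sketch in plain Python. It does not call bach: the `aggregate` helper, the `Row` alias and the sample rows are hypothetical, and the sketch assumes that missing values (None here) are skipped, as a `skipna` setting would suggest.

```python
import statistics
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Optional, Sequence, Tuple

# (partition key, value); None stands in for NULL.
Row = Tuple[Hashable, Optional[float]]


def aggregate(
    rows: Sequence[Row],
    func: Callable[[List[float]], float],
    by_partition: bool = True,
) -> Dict[Hashable, float]:
    """Apply `func` per partition, or over all values when by_partition is False."""
    groups: Dict[Hashable, List[float]] = defaultdict(list)
    for key, value in rows:
        if value is None:
            continue  # assumption: missing values are skipped
        groups[key if by_partition else "all"].append(value)
    # Note: a partition holding only missing values is omitted in this sketch.
    return {key: func(values) for key, values in groups.items()}


rows = [("a", 1), ("a", 1), ("a", None), ("b", 5), ("b", 7), ("b", 9)]

counts = aggregate(rows, len)                        # rows per partition
uniques = aggregate(rows, lambda v: len(set(v)))     # distinct values per partition
maxima = aggregate(rows, max)                        # maximum per partition
medians = aggregate(rows, statistics.median)         # median per partition
total = aggregate(rows, len, by_partition=False)     # no partition: all values
```

With no partition, every value falls into a single group, so each aggregation produces one result for the whole Series.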
Window
Series.window_first_value ([window]) | Returns value evaluated at the row that is the first row of the window frame. |
Series.window_lag ([offset, default, window]) | Returns value evaluated at the row that is offset rows before the current row within the window |
Series.window_nth_value (n[, window]) | Returns value evaluated at the row that is the n'th row of the window frame. |
Series.window_lead ([offset, default, window]) | Returns value evaluated at the row that is offset rows after the current row within the window. |
Series.window_last_value ([window]) | Returns value evaluated at the row that is the last row of the window frame. |
Series.window_row_number ([window]) | Returns the number of the current row within its window, counting from 1. |
Series.window_rank ([window]) | Returns the rank of the current row, with gaps; that is, the row_number of the first row in its peer group. |
Series.window_dense_rank ([window]) | Returns the rank of the current row, without gaps; this function effectively counts peer groups. |
Series.window_percent_rank ([window]) | Returns the relative rank of the current row, that is (rank - 1) / (total partition rows - 1). |
Series.window_ntile ([num_buckets, window]) | Returns an integer ranging from 1 to the argument value, dividing the partition as equally as possible. |
Series.window_cume_dist ([window]) | Returns the cumulative distribution, that is (number of partition rows preceding or peers with current row) / (total partition rows). |
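
The ranking formulas in the table above can be checked with a small worked example. The sketch below is plain Python rather than bach: `window_ranks` is a hypothetical helper that treats its whole input as one partition ordered ascending by value, and it assumes that percent rank is 0 for a single-row partition, as is usual in SQL databases.

```python
from typing import Dict, List, Sequence, Union

Number = Union[int, float]


def window_ranks(values: Sequence[Number]) -> List[Dict[str, Number]]:
    """Compute row_number, rank, dense_rank, percent_rank and cume_dist
    for one partition, ordered ascending by value."""
    ordered = sorted(values)
    total = len(ordered)
    result: List[Dict[str, Number]] = []
    rank = 0
    dense_rank = 0
    for position, value in enumerate(ordered):
        row_number = position + 1
        if position == 0 or value != ordered[position - 1]:
            # First row of a new peer group: rank jumps to the row number,
            # dense rank just counts peer groups.
            rank = row_number
            dense_rank += 1
        # Rows preceding the current row, plus its peers (ties).
        preceding_or_peers = sum(1 for other in ordered if other <= value)
        result.append({
            "value": value,
            "row_number": row_number,
            "rank": rank,
            "dense_rank": dense_rank,
            # (rank - 1) / (total partition rows - 1); 0 for a single row (assumption)
            "percent_rank": (rank - 1) / (total - 1) if total > 1 else 0.0,
            "cume_dist": preceding_or_peers / total,
        })
    return result


ranked = window_ranks([20, 10, 30, 20])
for row in ranked:
    print(row)
```

For the values 10, 20, 20, 30 the two tied rows share rank 2, the next row gets rank 4 (a gap) but dense rank 3 (no gap), and the cumulative distribution of the tied rows is 3 / 4 because three rows precede or are peers of them.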