Series
Bases: _SeriesCoreMixin, _SeriesSummaryMixin, Series
flowchart TD
metaframe.src.series.base.Series[Series]
metaframe.src.series.core._SeriesCoreMixin[_SeriesCoreMixin]
metaframe.src.series.summary._SeriesSummaryMixin[_SeriesSummaryMixin]
metaframe.src.series.core._SeriesCoreMixin --> metaframe.src.series.base.Series
metaframe.src.series.summary._SeriesSummaryMixin --> metaframe.src.series.base.Series
Extended pandas Series with dataframe-aware helpers and summaries.
This subclass behaves like pandas.Series, but operations that return
new objects yield this Series subclass or the project DataFrame type
instead of the plain pandas classes. It also provides additional helpers for:
- construction from DataFrames or Index objects
- regex matching
- structured statistical summaries
Source code in metaframe/src/series/base.py
_constructor
property
Series constructor used internally by pandas.
Ensures pandas operations that produce a new Series return this
subclass instead of pandas.Series.
Returns:
| Type | Description |
|---|---|
| Series | |
_constructor_expanddim
property
DataFrame constructor used when dimensionality increases.
Used internally when a Series becomes a DataFrame.
Returns:
| Type | Description |
|---|---|
| DataFrame | |
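The two properties above follow the standard pandas subclassing hooks. A minimal sketch of that mechanism, assuming hypothetical `MySeries` and `MyFrame` classes (these names are illustrative and are not the metaframe classes):

```python
import pandas as pd


class MyFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass, for illustration only."""

    @property
    def _constructor(self):
        # Used by pandas when an operation returns a new 2-D object.
        return MyFrame

    @property
    def _constructor_sliced(self):
        # Used by pandas when a column is extracted as a 1-D object.
        return MySeries


class MySeries(pd.Series):
    """Hypothetical Series subclass, for illustration only."""

    @property
    def _constructor(self):
        # Used by pandas when an operation returns a new 1-D object.
        return MySeries

    @property
    def _constructor_expanddim(self):
        # Used by pandas when the Series is promoted to a DataFrame.
        return MyFrame


s = MySeries([1, 2, 3], name="x")
doubled = s * 2        # arithmetic goes through _constructor
frame = s.to_frame()   # promotion goes through _constructor_expanddim
column = frame["x"]    # column access goes through _constructor_sliced
```

Without these overrides, each of the three results would fall back to the plain pandas classes and lose any custom methods.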
summary(**kwargs)
Compute a structured summary of the Series.
Produces a MultiIndex Series describing counts, missing values, descriptive statistics, value frequencies, and optional custom metrics.
The resulting Series will have a MultiIndex with the following levels:
- dtype: 'all', 'Not Numeric' or 'Numeric'. Indicates which subset of the original Series, by data type, the metric was computed on.
- Mode: 'Describe', 'Value Count' or 'Custom'. Indicates which source produced the metric:
  - Describe: pandas describe method
  - Value Count: pandas value_counts method
  - Custom: user-defined function from the d_func parameter
- Metric: name of the computed metric.
- Type: type of the metric. Each 'count' metric is followed by its associated '%' entry.
Behavior depends on value_counts:
- None -> automatically split numeric and non-numeric data
- True -> frequency-based summary only (on numeric and non-numeric data)
- False -> descriptive statistics only (on numeric data)
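The automatic split described for `value_counts=None` can be approximated with plain pandas. This is a rough sketch under stated assumptions, not the metaframe implementation: the numeric test (`isinstance` on `int`/`float`) and all variable names are illustrative choices.

```python
import pandas as pd

s = pd.Series([1, 2, "a", 2, "b", "a", 3, "a", None])

# Drop missing values, then flag entries that are real numbers
# (bool is excluded because it is a subclass of int).
values = s.dropna()
is_num = values.map(
    lambda v: isinstance(v, (int, float)) and not isinstance(v, bool)
)

numeric = values[is_num].astype(float)
non_numeric = values[~is_num]

# Descriptive statistics for the numeric part.
num_stats = numeric.describe()

# Frequencies, with their percentages, for the non-numeric part.
freq = non_numeric.value_counts()
freq_pct = (freq / freq.sum() * 100).round(1)
```

On this input the numeric part is `[1, 2, 2, 3]` and the non-numeric part is `['a', 'b', 'a', 'a']`, which matches the element counts shown in the example for `summary()`.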
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| kwargs | | SeriesSummaryOpts keyword arguments. | {} |
Returns:
| Type | Description |
|---|---|
| Series | MultiIndex Series containing the summary statistics. |
Examples:
>>> s = Series([1, 2, 'a', 2, 'b', 'a', 3, 'a', None])
>>> s.summary()
dtype        Mode         Metric         Type
all          Describe     Num. elements  count            9
                          NAs            count          1.0
                                         %             11.1
Not Numeric  Describe     Num. elements  count          4.0
                                         %             44.4
                          unique         count            2
                                         %             50.0
                          top            top              a
                          freq           freq             3
             Value Count  a              count            3
                                         %             75.0
                          b              count            1
                                         %             25.0
Numeric      Describe     Num. elements  count          4.0
                                         %             44.4
                          mean           mean           2.0
                          std            std           0.82
                          min            min            1.0
                          25%            percentile    1.75
                          50%            percentile     2.0
                          75%            percentile    2.25
                          max            max            3.0
                          sum            sum            8.0
             Custom       zeros          count          0.0
                                         %              0.0
                          filled         count          4.0
                                         %            100.0
dtype: object
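Because the result is an ordinary MultiIndex Series, individual metrics can be pulled out with standard pandas selection. The sketch below builds a small stand-in by hand, using the four level names and a few values from the example above, so that it runs without metaframe; with the real object, the same `.loc` and `.xs` calls should apply.

```python
import pandas as pd

# Hand-built stand-in for a summary result (a subset of the rows above).
index = pd.MultiIndex.from_tuples(
    [
        ("all", "Describe", "NAs", "count"),
        ("all", "Describe", "NAs", "%"),
        ("Numeric", "Describe", "mean", "mean"),
        ("Numeric", "Describe", "max", "max"),
    ],
    names=["dtype", "Mode", "Metric", "Type"],
)
summary = pd.Series([1.0, 11.1, 2.0, 3.0], index=index)

# A full four-part key returns a single scalar.
mean_value = summary.loc[("Numeric", "Describe", "mean", "mean")]

# xs selects one label on a named level and drops that level.
percentages = summary.xs("%", level="Type")
numeric_part = summary.xs("Numeric", level="dtype")
```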
Source code in metaframe/src/series/summary.py
_summary_not_num(l_summary_names, opts)
Compute summary statistics for non-numeric data.
Includes descriptive metrics and value frequencies, with optional percentage computation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| l_summary_names | List[str] | | required |
| opts | SeriesSummaryOpts | | required |
Returns:
| Type | Description |
|---|---|
| Series | MultiIndex summary for non-numeric values. |
Source code in metaframe/src/series/summary.py
_summary_num(l_summary_names, summary_mi, opts)
Compute summary statistics for numeric data.
Includes descriptive statistics (mean, std, min, percentiles, max, sum)
and optional custom metrics provided through d_func.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| l_summary_names | List[str] | | required |
| summary_mi | MultiIndex | | required |
| opts | SeriesSummaryOpts | | required |
Returns:
| Type | Description |
|---|---|
| Series | MultiIndex summary for numeric values. |
Source code in metaframe/src/series/summary.py
fullmatch(pattern, **kwargs)
Test whether each value fully matches a regex pattern.
Each value is cast to string and matched using re.fullmatch.
Missing values return False.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pattern | str | Regular expression pattern. | required |
| **kwargs | | Additional arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| Series of bool | |
Examples:
>>> s = Series(["A1", "B2", "AA"])
>>> s.fullmatch(r"[A-Z]\d")
0 True
1 True
2 False
dtype: bool
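The documented semantics (cast to string, apply `re.fullmatch`, missing values give False) can be sketched in plain Python. The helper name `fullmatch_values` and its `flags` parameter are hypothetical and only illustrate the behavior; they are not part of metaframe.

```python
import math
import re


def fullmatch_values(values, pattern, flags=0):
    """Return one bool per value: True only if str(value) matches entirely."""
    compiled = re.compile(pattern, flags)
    result = []
    for value in values:
        # Treat None and float NaN as missing, which never match.
        missing = value is None or (
            isinstance(value, float) and math.isnan(value)
        )
        if missing:
            result.append(False)
        else:
            result.append(compiled.fullmatch(str(value)) is not None)
    return result


flags_out = fullmatch_values(["A1", "B2", "AA", None], r"[A-Z]\d")
prefix_only = fullmatch_values(["A1x"], r"[A-Z]\d")
```

The second call shows the difference from `re.match`: "A1x" starts with a matching prefix, but the full string does not match, so the result is False.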
Source code in metaframe/src/series/core.py
to_int(start_at=0)
Encode unique values as consecutive integers.
Identical values receive identical integers. Missing values are preserved.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| start_at | int | Starting integer label. | 0 |
Returns:
| Type | Description |
|---|---|
| Series of int | |
Examples:
>>> s = Series(["a", "b", "a"])
>>> s.to_int()
0 0
1 1
2 0
dtype: int64
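The encoding can be illustrated in plain Python. This sketch assumes labels are assigned in order of first appearance, which is consistent with the example above but is not confirmed by the source; the helper name `encode_as_int` is hypothetical.

```python
import math


def encode_as_int(values, start_at=0):
    """Map each distinct value to an integer, keeping missing values as None."""
    codes = {}
    encoded = []
    for value in values:
        # Missing values (None or float NaN) are passed through unlabelled.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            encoded.append(None)
            continue
        # Assumption: a new value gets the next free integer.
        if value not in codes:
            codes[value] = start_at + len(codes)
        encoded.append(codes[value])
    return encoded


basic = encode_as_int(["a", "b", "a"])
shifted = encode_as_int(["x", None, "y", "x"], start_at=1)
```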
Source code in metaframe/src/series/core.py