dfchain.core.executor package

Submodules

dfchain.core.executor.groupbylike module

class dfchain.core.executor.groupbylike.GroupByLike(*args, **kwargs)[source]

Bases: Protocol

Structural protocol for objects that expose a pandas-compatible groupby method. The signature mirrors pandas.DataFrame.groupby:

def groupby(
    self,
    by=None,
    axis: Axis | lib.NoDefault = lib.no_default,
    level: IndexLabel | None = None,
    as_index: bool = True,
    sort: bool = True,
    group_keys: bool = True,
    observed: bool | lib.NoDefault = lib.no_default,
    dropna: bool = True,
)
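Because GroupByLike is a Protocol, any object with a matching groupby method should satisfy it without inheriting from it. The following is a minimal sketch of that idea, not the dfchain source: the protocol body is reduced to a loose signature, and ListFrame is a hypothetical toy class that uses a list of row dicts in place of a real dataframe.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class GroupByLike(Protocol):
    """Hypothetical stand-in for dfchain's GroupByLike protocol."""

    def groupby(self, by: Any = None, **kwargs: Any) -> Any:
        ...


class ListFrame:
    """Toy frame (assumption): a list of row dicts, not a real backend."""

    def __init__(self, rows):
        self.rows = list(rows)

    def groupby(self, by=None, **kwargs):
        # Bucket rows by the value found under the `by` column.
        groups = {}
        for row in self.rows:
            groups.setdefault(row[by], []).append(row)
        return groups


frame = ListFrame([{"k": "a", "v": 1}, {"k": "b", "v": 2}, {"k": "a", "v": 3}])

# runtime_checkable only verifies that a `groupby` attribute exists;
# it does not compare parameter lists or return types.
is_groupby_like = isinstance(frame, GroupByLike)
grouped = frame.groupby("k")
```

Static type checkers compare the full signature, so a real implementation would want to accept the same keyword arguments as the pandas method listed above.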

dfchain.core.executor.partitionable module

class dfchain.core.executor.partitionable.PartitionAble(_groupkey: collections.abc.Hashable | None = None)[source]

Bases: ABC

abstractmethod clear_groups() → None[source]: Clear any cached grouping state maintained by the executor.

get_groupkey() → Hashable[source]: Return the group key held by this executor.

property groupby: GroupByLike

Return a groupby object for the wrapped dataframe.

This is a thin wrapper around _groupby() to keep the public API backend‑agnostic while allowing implementations to choose the concrete groupby type.
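The wrapper pattern described here can be sketched as a public property that delegates to an overridable hook. This is an illustration under assumptions, not the dfchain implementation: ExecutorSketch and ListExecutor are hypothetical names, and a plain dict stands in for a backend's groupby object.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Hashable, List


class ExecutorSketch(ABC):
    """Hypothetical stand-in for PartitionAble; not the dfchain class."""

    @property
    def groupby(self) -> Any:
        # Public, backend-agnostic entry point: delegate to the hook.
        return self._groupby()

    @abstractmethod
    def _groupby(self) -> Any:
        """Return the backend's concrete groupby object."""


class ListExecutor(ExecutorSketch):
    """Toy backend (assumption): rows are dicts, groups are a dict."""

    def __init__(self, rows: List[dict], groupkey: Hashable) -> None:
        self._rows = rows
        self._groupkey = groupkey

    def _groupby(self) -> Dict[Hashable, List[dict]]:
        groups: Dict[Hashable, List[dict]] = {}
        for row in self._rows:
            groups.setdefault(row[self._groupkey], []).append(row)
        return groups


executor = ListExecutor([{"k": "a", "v": 1}, {"k": "b", "v": 2}], groupkey="k")
groups = executor.groupby  # property access, so no call parentheses
```

Callers only ever touch the groupby property, so a backend can change what _groupby() returns without affecting the public API.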

abstractmethod iter_chunks() → Iterable[DataFrameLike][source]

Iterate over the dataframe in chunks.

The chunking strategy (by row count, partition, etc.) is left to the concrete implementation.
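As one possible strategy, chunking by row count could look like the sketch below. It is written as a free function over a list of row dicts so that it runs without a dataframe library. The chunk_size parameter is an assumption for illustration; the abstract method itself takes no arguments.

```python
from typing import Iterator, List, Sequence


def iter_chunks(rows: Sequence[dict], chunk_size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `chunk_size` rows."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(rows), chunk_size):
        # The last slice may be shorter than chunk_size.
        yield list(rows[start:start + chunk_size])


rows = [{"id": i} for i in range(5)]
chunks = list(iter_chunks(rows, chunk_size=2))
chunk_sizes = [len(chunk) for chunk in chunks]
```

A concrete executor would presumably hold the chunk size, or a partitioning scheme, as instance state rather than as a call argument.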

abstractmethod iter_groups() → Iterable[tuple[Hashable, DataFrameLike]][source]: Iterate over grouped data as (key, group_df) pairs.
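A minimal sketch of the (key, group_df) contract, assuming rows held as dicts and an explicit groupkey argument (both hypothetical; a concrete executor would more likely use the key returned by get_groupkey()):

```python
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def iter_groups(
    rows: Sequence[dict], groupkey: Hashable
) -> Iterator[Tuple[Hashable, List[dict]]]:
    """Yield (key, rows_in_group) pairs in order of first appearance."""
    groups: Dict[Hashable, List[dict]] = {}
    for row in rows:
        groups.setdefault(row[groupkey], []).append(row)
    # dicts preserve insertion order, so keys come out in first-seen order.
    yield from groups.items()


rows = [{"k": "b", "v": 1}, {"k": "a", "v": 2}, {"k": "b", "v": 3}]
pairs = list(iter_groups(rows, "k"))
keys = [key for key, _ in pairs]
```

Note that pandas sorts group keys by default (sort=True), whereas this sketch keeps first-seen order; which ordering iter_groups guarantees is left to the implementation.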

abstractmethod rebuild_groups(flush_every: int = 1) → None[source]

abstractmethod update_group(df: DataFrameLike) → None[source]: Update the current group with the provided dataframe.

abstractmethod write_chunk(key: int, val: DataFrameLike)[source]
