dfchain.core.executor package

Submodules

dfchain.core.executor.groupbylike module

class dfchain.core.executor.groupbylike.GroupByLike(*args, **kwargs)[source]

Bases: Protocol

Protocol whose groupby method follows the signature of pandas DataFrame.groupby:

def groupby(
    self,
    by=None,
    axis: Axis | lib.NoDefault = lib.no_default,
    level: IndexLabel | None = None,
    as_index: bool = True,
    sort: bool = True,
    group_keys: bool = True,
    observed: bool | lib.NoDefault = lib.no_default,
    dropna: bool = True,
)
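Because GroupByLike is a Protocol, conformance is structural: any object with a matching groupby method qualifies, with no inheritance required. The sketch below illustrates this with the standard library only. GroupByLikeSketch and ToyFrame are hypothetical stand-ins, not dfchain or pandas classes, and the protocol body is an assumption based on the signature above.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class GroupByLikeSketch(Protocol):
    """Hypothetical stand-in for GroupByLike: anything exposing groupby()."""

    def groupby(self, by: Any = None, **kwargs: Any) -> Any: ...


class ToyFrame:
    """Toy row container (assumption for illustration, not a real dataframe)."""

    def __init__(self, rows: list[dict]) -> None:
        self.rows = rows

    def groupby(self, by: Any = None, **kwargs: Any) -> dict:
        # Bucket rows by the value stored under the `by` column.
        groups: dict = {}
        for row in self.rows:
            groups.setdefault(row[by], []).append(row)
        return groups


frame = ToyFrame([{"k": "a", "v": 1}, {"k": "b", "v": 2}, {"k": "a", "v": 3}])

# runtime_checkable only verifies that a groupby attribute exists;
# it does not check the parameter list.
is_groupby_like = isinstance(frame, GroupByLikeSketch)
grouped = frame.groupby("k")
```

ToyFrame never subclasses the protocol, yet the isinstance check passes, which is the property that lets different dataframe backends be used interchangeably.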

dfchain.core.executor.partitionable module

class dfchain.core.executor.partitionable.PartitionAble(_groupkey: collections.abc.Hashable | None = None)[source]

Bases: ABC

abstractmethod clear_groups() None[source]

Clear any cached grouping state maintained by the executor.

get_groupkey() Hashable[source]

Return the group key held by this instance (the _groupkey constructor argument).

property groupby: GroupByLike

Return a groupby object for the wrapped dataframe.

This is a thin wrapper around _groupby() that keeps the public API backend-agnostic while letting each implementation choose the concrete groupby type.
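The property-plus-hook split described above can be sketched as follows. ExecutorSketch and DictExecutor are hypothetical names, and treating _groupby() as an abstract method is an assumption; the source only states that the property wraps it.

```python
from abc import ABC, abstractmethod
from typing import Any


class ExecutorSketch(ABC):
    """Hypothetical base class mirroring only the groupby/_groupby split."""

    @property
    def groupby(self) -> Any:
        # Public, backend-agnostic entry point: delegate to the hook.
        return self._groupby()

    @abstractmethod
    def _groupby(self) -> Any:
        """Backend-specific hook returning the concrete groupby object."""


class DictExecutor(ExecutorSketch):
    """Toy backend (assumption): rows are dicts, groups are a plain dict."""

    def __init__(self, rows: list[dict], key: str) -> None:
        self.rows = rows
        self.key = key

    def _groupby(self) -> dict:
        groups: dict = {}
        for row in self.rows:
            groups.setdefault(row[self.key], []).append(row)
        return groups


ex = DictExecutor([{"k": 1}, {"k": 2}, {"k": 1}], key="k")
result = ex.groupby  # attribute access, no call parentheses
```

Callers only ever touch the groupby property, so a backend can change what _groupby() returns without affecting calling code.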

abstractmethod iter_chunks() Iterable[DataFrameLike][source]

Iterate over the dataframe in chunks.

The chunking strategy (by row count, partition, etc.) is left to the concrete implementation.
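As one possible strategy, a row-count chunker might look like the sketch below. The function name iter_chunks_by_rows, the chunk_size parameter, and the use of a list of dicts in place of DataFrameLike are all assumptions for illustration.

```python
from collections.abc import Iterator


def iter_chunks_by_rows(rows: list[dict], chunk_size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most chunk_size rows."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


rows = [{"i": i} for i in range(5)]
chunks = list(iter_chunks_by_rows(rows, 2))
chunk_lengths = [len(chunk) for chunk in chunks]  # the last chunk may be shorter
```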

abstractmethod iter_groups() Iterable[tuple[Hashable, DataFrameLike]][source]

Iterate over grouped data as (key, group_df) pairs.
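A minimal sketch of the (key, group_df) contract, again using a list of dicts as a stand-in for DataFrameLike. The free function iter_groups_sketch and its groupkey parameter are hypothetical; a real implementation would be a method on a PartitionAble subclass.

```python
from collections.abc import Hashable, Iterator


def iter_groups_sketch(
    rows: list[dict], groupkey: str
) -> Iterator[tuple[Hashable, list[dict]]]:
    """Yield (key, group) pairs, one per distinct value of groupkey."""
    groups: dict = {}
    for row in rows:
        groups.setdefault(row[groupkey], []).append(row)
    # dict preserves insertion order, so keys appear in first-seen order.
    yield from groups.items()


data = [
    {"city": "Oslo", "n": 1},
    {"city": "Rome", "n": 2},
    {"city": "Oslo", "n": 3},
]
pairs = list(iter_groups_sketch(data, "city"))
keys = [key for key, _ in pairs]
```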

abstractmethod rebuild_groups(flush_every: int = 1) None[source]

abstractmethod update_group(df: DataFrameLike) None[source]

Update the current group with the provided dataframe.

abstractmethod write_chunk(key: int, val: DataFrameLike)[source]

Module contents