src.hersteller.data_processing.pipeline_operations.py

class ColumnDropper(target: list)

Bases: BaseEstimator, TransformerMixin

Drop the specified columns from the DataFrame.

Parameters:

target (list) – List of columns to drop

fit(target)

Return self.

transform(x: DataFrame) → DataFrame

Drop the specified columns from the DataFrame.

Parameters:

x (pd.DataFrame) – The DataFrame to transform

Returns:

The transformed DataFrame

Return type:

pd.DataFrame
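
The behaviour described for ColumnDropper can be illustrated with a small stand-in. This is only a sketch: the class name ColumnDropperSketch, the sample columns, and the use of DataFrame.drop are assumptions, since the actual implementation is not shown in this reference.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class ColumnDropperSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for ColumnDropper (not the project's code)."""

    def __init__(self, target: list):
        self.target = target

    def fit(self, target):
        # Nothing is learned; mirror the documented "Return self."
        return self

    def transform(self, x: pd.DataFrame) -> pd.DataFrame:
        # Assumption: a new DataFrame is returned and the input is not mutated.
        return x.drop(columns=self.target)


df = pd.DataFrame({"id": [1, 2], "price": [9.5, 12.0], "note": ["a", "b"]})
result = ColumnDropperSketch(target=["id", "note"]).fit(df).transform(df)
print(list(result.columns))  # ['price']
```

Whether the real transformer raises on missing columns or ignores them is not stated above; DataFrame.drop raises a KeyError by default.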

class ColumnTypeSetter(target: list)

Bases: BaseEstimator, TransformerMixin

Set the specified columns to type float in the DataFrame.

Parameters:

target (list) – List of columns to cast to float

fit(target)

Return self.

transform(x: DataFrame) → DataFrame

Set the specified columns to type float in the DataFrame.

Parameters:

x (pd.DataFrame) – The DataFrame to transform

Returns:

The transformed DataFrame

Return type:

pd.DataFrame
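
A minimal sketch of the float conversion, assuming it is done with DataFrame.astype. The class name ColumnTypeSetterSketch and the sample data are hypothetical; the real class may convert the columns differently.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class ColumnTypeSetterSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for ColumnTypeSetter (not the project's code)."""

    def __init__(self, target: list):
        self.target = target

    def fit(self, target):
        # Nothing is learned; mirror the documented "Return self."
        return self

    def transform(self, x: pd.DataFrame) -> pd.DataFrame:
        # Assumption: only the listed columns are cast; astype returns a new frame.
        return x.astype({col: float for col in self.target})


df = pd.DataFrame({"width": ["1.5", "2"], "height": [3, 4], "name": ["a", "b"]})
result = ColumnTypeSetterSketch(target=["width", "height"]).fit(df).transform(df)
print(result.dtypes)
```

With astype, a value that cannot be parsed as a number raises a ValueError; how the real class handles such values is not documented here.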

class OneHotEncodePd(target: str, prefix: str, sep: str, required_columns=None)

Bases: BaseEstimator, TransformerMixin

One-hot encode the specified column.

Parameters:
  • target (str) – The column to one-hot encode.

  • prefix (str) – The prefix to use for the one-hot encoded columns.

  • sep (str) – The separator to use for the one-hot encoded columns.

  • required_columns (list) – A list of columns that should be present in the DataFrame after one-hot encoding.

fit(target)

Return self.

transform(x: DataFrame) → DataFrame

One-hot encode the specified column.

Parameters:

x (pd.DataFrame) – The DataFrame to transform

Returns:

The transformed DataFrame

Return type:

pd.DataFrame
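
One way the parameters of OneHotEncodePd could fit together is sketched below with pd.get_dummies. The class name OneHotEncodePdSketch, the sample data, and in particular the choice to add missing required_columns as all-zero columns are assumptions; the reference does not say how the real class treats them.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class OneHotEncodePdSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for OneHotEncodePd (not the project's code)."""

    def __init__(self, target: str, prefix: str, sep: str, required_columns=None):
        self.target = target
        self.prefix = prefix
        self.sep = sep
        self.required_columns = required_columns

    def fit(self, target):
        # Nothing is learned; mirror the documented "Return self."
        return self

    def transform(self, x: pd.DataFrame) -> pd.DataFrame:
        # Replace the target column with one indicator column per category.
        out = pd.get_dummies(
            x, columns=[self.target], prefix=self.prefix,
            prefix_sep=self.sep, dtype=int,
        )
        # Assumption: required columns absent from the data are added as zeros.
        for col in self.required_columns or []:
            if col not in out.columns:
                out[col] = 0
        return out


df = pd.DataFrame({"fuel": ["diesel", "petrol"], "power": [90, 110]})
encoder = OneHotEncodePdSketch(
    target="fuel", prefix="fuel", sep="_",
    required_columns=["fuel_diesel", "fuel_petrol", "fuel_electric"],
)
result = encoder.fit(df).transform(df)
print(list(result.columns))
# ['power', 'fuel_diesel', 'fuel_petrol', 'fuel_electric']
```

Fixing the output columns this way keeps the schema stable when a batch happens to lack one of the categories.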

class MultiOneHotEncodePd(target: str, prefix: str, sep: str, required_columns=None)

Bases: BaseEstimator, TransformerMixin

One-hot encode the specified column, where each cell contains a list of categorical values.

Parameters:
  • target (str) – The column to one-hot encode; each cell holds a list of categorical values.

  • prefix (str) – The prefix to use for the one-hot encoded columns.

  • sep (str) – The separator to use for the one-hot encoded columns.

  • required_columns (list) – A list of columns that should be present in the DataFrame after one-hot encoding.

fit(target)

Return self.

transform(x: DataFrame) → DataFrame

One-hot encode the specified column containing lists of categorical values.

Parameters:

x (pd.DataFrame) – The DataFrame to transform

Returns:

The transformed DataFrame

Return type:

pd.DataFrame
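
For a column whose cells are lists, one possible approach is to explode the lists, one-hot encode the individual values, and collapse the rows back per original index. The sketch below does that; the class name MultiOneHotEncodePdSketch, the sample data, the explode/groupby technique, and the zero-filling of required_columns are all assumptions rather than the documented implementation.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class MultiOneHotEncodePdSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for MultiOneHotEncodePd (not the project's code)."""

    def __init__(self, target: str, prefix: str, sep: str, required_columns=None):
        self.target = target
        self.prefix = prefix
        self.sep = sep
        self.required_columns = required_columns

    def fit(self, target):
        # Nothing is learned; mirror the documented "Return self."
        return self

    def transform(self, x: pd.DataFrame) -> pd.DataFrame:
        # One row per list element, keeping the original row index.
        exploded = x[self.target].explode()
        dummies = pd.get_dummies(
            exploded, prefix=self.prefix, prefix_sep=self.sep, dtype=int
        )
        # Collapse back to one row per original index (1 if any element matched).
        dummies = dummies.groupby(level=0).max()
        out = x.drop(columns=[self.target]).join(dummies)
        # Assumption: required columns absent from the data are added as zeros.
        for col in self.required_columns or []:
            if col not in out.columns:
                out[col] = 0
        return out


df = pd.DataFrame(
    {"model": ["a", "b"], "extras": [["gps", "heated_seats"], ["gps"]]}
)
encoder = MultiOneHotEncodePdSketch(target="extras", prefix="extras", sep="_")
result = encoder.fit(df).transform(df)
print(list(result.columns))  # ['model', 'extras_gps', 'extras_heated_seats']
```

The sketch relies on the DataFrame index being unique, because the encoded columns are joined back on it.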

class NormalizeCols(target: str, feature_range: tuple, column_range: tuple)

Bases: BaseEstimator, TransformerMixin

Normalize the specified column to the specified feature range using the provided column range.

Parameters:
  • target (str) – The column to normalize.

  • feature_range (tuple) – The desired range of the transformed data.

  • column_range (tuple) – The actual range of the column data.

fit(target)

Return self.

transform(x: DataFrame) → DataFrame

Normalize the specified column to the specified feature range using the provided column range.

Parameters:

x (pd.DataFrame) – The DataFrame to transform

Returns:

The transformed DataFrame

Return type:

pd.DataFrame
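
The description of NormalizeCols matches ordinary min-max scaling with a fixed, user-supplied column range instead of one learned in fit. A minimal sketch under that assumption follows; the class name NormalizeColsSketch and the sample data are hypothetical.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class NormalizeColsSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for NormalizeCols (not the project's code)."""

    def __init__(self, target: str, feature_range: tuple, column_range: tuple):
        self.target = target
        self.feature_range = feature_range
        self.column_range = column_range

    def fit(self, target):
        # Nothing is learned; the column range is supplied up front.
        return self

    def transform(self, x: pd.DataFrame) -> pd.DataFrame:
        col_min, col_max = self.column_range
        new_min, new_max = self.feature_range
        out = x.copy()
        # Map [col_min, col_max] linearly onto [new_min, new_max].
        scaled = (out[self.target] - col_min) / (col_max - col_min)
        out[self.target] = scaled * (new_max - new_min) + new_min
        return out


df = pd.DataFrame({"mileage": [0, 50_000, 200_000]})
scaler = NormalizeColsSketch(
    target="mileage", feature_range=(0, 1), column_range=(0, 200_000)
)
result = scaler.fit(df).transform(df)
print(result["mileage"].tolist())  # [0.0, 0.25, 1.0]
```

Values outside column_range would land outside feature_range in this sketch; whether the real class clips them is not stated.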