How-to: Filter data in the report

Report filtering can be done using the ppc_robot_lib.reporting.transformation.filtering.filter_table(). This step expects a list of functions that accepts a pandas.DataFrame and returns a series of booleans that indicates which rows matches the given condition.

This series can be easily produced by taking a column of the DataFrame and comparing it to other column, or a single value:

def product_is_approved(df: pandas.DataFrame):
    return df['Status'] == 'Approved'

def has_impressions(df: pandas.DataFrame):
    return df['Impressions'] > 0

filtered_df = filter_table(df_to_filter, [
    product_is_approved,
    has_impressions,
])

For static filters, this is very simple. But if you would like to build the filters according to the user input, things get a little more complex.

The PPC Robot user interface usually gives you the filters in a list that looks like this:

[
    {"field": "Status", "op": "=", "value": "Approved"},
    {"field": "Impressions", "op": ">", "value": 0}
]

field is the table column to use, op is the operator used for comparison, value is the reference value to compare to. To process these filters, you have to dynamically create a filtering function for each row. A good idea is to start by creating a factory function that will create a factory for these functions:

Note

The following guide applies to you only if you are preparing a report for a new platform. If you are implementing an report for Sklik or AdWords, you can use a factory function that is already implemented and well-tested. See the last section of this guide for more details.

def create_filter_fn(field, op, value):
    def filter_fn(df: pandas.DataFrame):
        if op == '=':
            return df[field] == value
        elif op == '>':
            return df[field] > value
        elif op == '<':
            return df[field] < value
        # TODO: Implement additional operators such as !=, <=, >=, ...

    return filter_fn

You can use this factory to create your desired list of filter functions:

# filters are extracted from parameters and it might look like this:
filters = [
    {"field": "Status", "op": "=", "value": "Approved"},
    {"field": "Impressions", "op": ">", "value": 0}
]
#

filter_functions = [create_filter_fn(filter['field'], filter['op'], filter['value']) for filter in filters]
filtered_df = filter_table(df_to_filter, filter_functions)

It works! However, it is quite dull to list each supported operator as a new branch of if clause. We can help ourselves by creating a dict that will contain comparison functions that we can get and call:

OPERATOR_MAP = {
    '=': lambda lhs, rhs: lhs == rhs
    '>': lambda lhs, rhs: lhs > rhs
    '>=': lambda lhs, rhs: lhs >= rhs
    '<': lambda lhs, rhs: lhs < rhs
    '<=': lambda lhs, rhs: lhs <= rhs
    '!=': lambda lhs, rhs: lhs != rhs
}

def create_filter_fn(field, op, value):
    operator_fn = OPERATOR_MAP[op]

    def filter_fn(df: pandas.DataFrame):
        return operator_fn(df[field], value)

But hey, all these lambdas are quite repetitive. Luckily, the guys that created Python already prepared functions like these we can use in the operator module. The example will look like this:

import operator

OPERATOR_MAP = {
    '=': operator.eq
    '>': operator.gt
    '>=': operator.ge
    '<': operator.lt
    '<=': operator.le
    '!=': operator.ne
}

def create_filter_fn(field, op, value):
    operator_fn = OPERATOR_MAP[op]

    def filter_fn(df: pandas.DataFrame):
        return operator_fn(df[field], value)

If you need additional, prerhaps more obscure comparison operators, you can always use a lambda in your map. The user interface can work with any operators you prepare, so add as much operators as you like and you think that would be useful for the users.

Filtering in AdWords and Sklik

AdWords: filter function factory is named create_filter_function can be imported from the ppc_robot_lib.adwords.step_factory module:

from ppc_robot_lib.adwords.step_factory import create_filter_function

The operator parameter must be a member of the ppc_robot_lib.adwords.query.Op enumeration. The user interface will give you the parameter as string, so you have to convert it before you pass it to the factory:

from ppc_robot_lib.adwords.query import Op

filter_functions = [
    create_filter_function(filter['field'], Op(filter['op']), filter['value']) for filter in filters
]

Skik: is is almost the same, just import from the ppc_robot_lib.sklik package:

from ppc_robot_lib.sklik.step_factory import create_filter_function

The op must be a member of ppc_robot_lib.sklik.query.Op enum:

from ppc_robot_lib.sklik.query import Op

filter_functions = [
    create_filter_function(filter['field'], Op(filter['op']), filter['value']) for filter in filters
]