Tag: Custom Loss

  • Custom Objective Function in XGBoost

    In the previous post, we covered how you can create a custom loss function in Catboost, but you might be using catboost, so how can you create the same if you’re using Xgboost to train your models. In this post, I’ll walk over an example using the famous Titanic dataset, where we’ll recreate the LogLoss function and compare the results with the standard implementation in the library.

    First, we have to set up the data.

    import numpy as np 
    import seaborn as sns
    import pandas as pd
    import xgboost as xgb
    from sklearn.metrics import log_loss

    data = sns.load_dataset('titanic')

    Then some data cleaning and setting up the training dataset. The goal is not to get the best model but to demonstrate the custom loss function, so not much feature engineering is being done.

    data['embarked'].fillna('S', inplace = True)

    X,y = data[[c for c in data.columns if c not in \
    ['survived', 'alive', 'deck', 'embark_town']]], \
    data['survived']

    cat_columns = ['pclass', 'sex', 'sibsp', 'parch', 'embarked', 'class',
    'who', 'adult_male', 'alone']

    X = pd.get_dummies(X, columns=cat_columns, drop_first=True)

    Let’s say there was no loss function like logloss, then how would you define the logloss as an objective function.

    LogLoss = -1/N \sum({y_{i}log(\hat{y}) + (1-y_{i})log(1-\hat{y})})

    You’ll have to calculate the first and second derivative with respect to the \hat{y}

    \Large \frac{\partial LogLoss}{\partial \hat{y}} = -\frac{y_{i}}{\hat{y}} + \frac{1-y_{i}}{1-\hat{y}}

    \Large  \frac{\partial^2LogLoss }{\partial \hat{y}^2} = \frac{y_{i}}{\hat{y}^{2}} + \frac{1-y_{i}}{(1-\hat{y})^{2}}

    Now we will write these up as Python functions and create a function that returns the gradient and hessian (second derivative) values. In the xgboost library, the first value being passed is the predictions and the second is the training matrix.

    def log_loss_derivative(y_pred, dtrain ):
    y = dtrain.get_label()
    return (-y/y_pred) + ((1-y)/(1-y_pred))

    def log_loss_second_derivative(y_pred, dtrain):
    y = dtrain.get_label()
    return (y/np.power(y_pred,2)) + ((1-y)/np.power((1-y_pred),2))

    def custom_log_loss(predt, dtrain):
    y_pred = np.clip(predt, a_max=1-1e-5, a_min=1e-5)
    grad = log_loss_derivative(y_pred= y_pred, dtrain = dtrain)
    hess = log_loss_second_derivative(y_pred= y_pred, dtrain = dtrain)
    return grad, hess

    We clip the predictions to avoid division by zero errors. Now let’s train.

    import xgboost as xgb

    dtrain =xgb.DMatrix(data=X, label=y)

    model = xgb.train({'tree_method': 'hist', 'seed': 1994},
    dtrain=dtrain,
    num_boost_round=10,
    obj=custom_log_loss)

    log_loss(y_pred=np.clip(model.predict(dtrain), a_max=1, a_min=0), y_true=y)
    >>>0.24912

    Comparison with the standard implementation.

    clf = xgb.XGBClassifier(n_estimators = 10, **{'tree_method': 'hist', 'seed': 1994})
    clf.fit(X,y)

    log_loss(y_pred=np.clip(clf.predict_proba(X)[:,1], a_max=1, a_min=0), y_true=y)

    >>>0.2861

    As we can see the metrics are very close in our implementation of the LogLoss and the standard implementation. Of course, you should use the standard implementation when it’s available, but in case you want to use a custom loss function, you now know how to do so.

  • Creating a Custom Loss Function For Machine Learning Models

    While standard Machine Learning Libraries provide a vast array of loss functions out of the box, sometimes we need to create our own custom loss function. In this blog post, I’ll go over a simple example and create a custom loss function in Catboost.

    First we will create the data for training.

    # Importing libraries
    import numpy as np
    import pandas as pd
    from sklearn.metrics import mean_squared_error
    from catboost import CatBoostRegressor, Pool
    from sklearn.datasets import fetch_california_housing

    raw_data = fetch_california_housing()

    data = pd.concat([pd.DataFrame(raw_data['data'], columns=raw_data['feature_names']),
    pd.Series(raw_data['target'], name = 'target')], axis = 1)

    features = [i for i in data.columns.tolist() if i != 'target']

    Since the objective is not to create the best model possible, we won’t be doing any feature engineering. Let’s use catboost, and create a model using standard loss functions.

    model = CatBoostRegressor(loss_function='RMSE', n_estimators=100, eval_metric='RMSE')

    cb_pool = Pool(data=data[features], label=data['target'], feature_names=features)

    model.fit(cb_pool)

    predictions = model.predict(cb_pool)

    mean_squared_error(y_true=data['target'], y_pred=predictions)

    Upon evaluating the model we find that the mean squared error is 0.15. Definitely a model which is overfitting, but that’s not a concern for this tutorial.

    But what is you don’t want to use RMSE as a loss function, and instead want to use something like this –

    loss = \frac{\sum (y - \hat{y})^{4}}{n}

    Then how do you create a loss function in catboost?

    For this, you’ll need to calculate the first derivative and the second derivative of the loss function with respect to \hat{y}.

    Using the chain rule, the first derivative is

    \frac{\partial (y-\hat{y})^4}{\partial \hat{y}} = \frac{\partial (y-\hat{y})^4}{\partial (y-\hat{y})}*\frac{\partial y - \hat{y}}{\partial \hat{y}} = 4 * (y - \hat{y})^{3}* -1 = -4(y -\hat{y})^{3}

    And similarly using the chain rule, the second derivative comes out to be 12*(y-\hat{y})^2

    The catboost template for a custom objective is as follows –

    class UserDefinedObjective(object):
        def calc_ders_range(self, approxes, targets, weights):
            """
            Computes first and second derivative of the loss function 
            with respect to the predicted value for each object.
    
            Parameters
            ----------
            approxes : indexed container of floats
                Current predictions for each object.
    
            targets : indexed container of floats
                Target values you provided with the dataset.
    
            weight : float, optional (default=None)
                Instance weight.
    
            Returns
            -------
                der1 : list-like object of float
                der2 : list-like object of float
    
            """
            pass
    

    Using this temple, we can write the custom objective –

    class CustomLossObjective(object):
    def calc_ders_range(self, approxes, targets, weights):
    assert len(approxes) == len(targets)
    if weights is not None:
    assert len(weights) == len(approxes)

    result = []
    n = len(targets) # Number of samples

    for index in range(len(targets)):
    error = targets[index] - approxes[index]
    der1 = -4 * error**3
    der2 = 12 * error**2

    if weights is not None:
    der1 *= weights[index]
    der2 *= weights[index]

    result.append((der1, der2))
    return result

    Now let’s use this custom loss in our model

    model = CatBoostRegressor(loss_function=CustomLossObjective(), n_estimators=100, eval_metric='RMSE')
    model.fit(cb_pool)

    predictions = model.predict(cb_pool)
    mean_squared_error(y_true=data['target'], y_pred=predictions)

    Using this loss, we see that the mean squared error is 0.735, this is clearly inferior to using RMSE, but as mentioned before the objective of this blog post is not to build the best model but to showcase how one can create a custom loss objective in catboost.