In the previous post, we covered how to create a custom loss function in CatBoost. But you might not be using CatBoost, so how can you do the same thing if you're training your models with XGBoost? In this post, I'll walk through an example using the famous Titanic dataset, where we'll recreate the LogLoss function and compare the results with the standard implementation in the library.
First, we have to set up the data.
import numpy as np
import seaborn as sns
import pandas as pd
import xgboost as xgb
from sklearn.metrics import log_loss
data = sns.load_dataset('titanic')
Next comes some data cleaning and setting up the training dataset. The goal is not to build the best possible model but to demonstrate the custom loss function, so we won't do much feature engineering.
data['embarked'].fillna('S', inplace=True)
# Drop the target, 'alive' (which duplicates the target), and sparse/redundant columns
feature_cols = [c for c in data.columns
                if c not in ['survived', 'alive', 'deck', 'embark_town']]
X, y = data[feature_cols], data['survived']
cat_columns = ['pclass', 'sex', 'sibsp', 'parch', 'embarked', 'class',
               'who', 'adult_male', 'alone']
X = pd.get_dummies(X, columns=cat_columns, drop_first=True)
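Optionally, you can take a quick look at what's left in the feature matrix (this check isn't part of the walkthrough itself). Assuming the standard seaborn version of the Titanic data, only the age column should still contain missing values, and XGBoost handles those natively.
# Optional check: shape and remaining missing values in the prepared features
print(X.shape)
missing = X.isnull().sum()
print(missing[missing > 0])  # with seaborn's titanic data, only 'age' should appear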
Let's say there were no built-in logloss objective. How would you define LogLoss as an objective function yourself? You'll have to work out its first and second derivatives with respect to the predictions.
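For reference, here is what those derivatives look like. This derivation isn't spelled out in the code, but it's exactly what the functions below implement: for a label $y$ and a predicted probability $p$, the log loss and its derivatives with respect to $p$ are

$$L(y, p) = -\left[\, y \log p + (1 - y)\log(1 - p) \,\right]$$

$$\frac{\partial L}{\partial p} = -\frac{y}{p} + \frac{1 - y}{1 - p}$$

$$\frac{\partial^2 L}{\partial p^2} = \frac{y}{p^2} + \frac{1 - y}{(1 - p)^2}$$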
Now we will write these up as Python functions and create a function that returns the gradient and hessian (second derivative) values. In the xgboost library, the first argument passed to a custom objective is the predictions and the second is the training DMatrix.
def log_loss_derivative(y_pred, dtrain):
    # First derivative (gradient) of the log loss with respect to the prediction
    y = dtrain.get_label()
    return (-y / y_pred) + ((1 - y) / (1 - y_pred))

def log_loss_second_derivative(y_pred, dtrain):
    # Second derivative (hessian) of the log loss with respect to the prediction
    y = dtrain.get_label()
    return (y / np.power(y_pred, 2)) + ((1 - y) / np.power(1 - y_pred, 2))

def custom_log_loss(predt, dtrain):
    # Clip predictions away from 0 and 1 to avoid division by zero
    y_pred = np.clip(predt, a_max=1 - 1e-5, a_min=1e-5)
    grad = log_loss_derivative(y_pred=y_pred, dtrain=dtrain)
    hess = log_loss_second_derivative(y_pred=y_pred, dtrain=dtrain)
    return grad, hess
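As a quick sanity check (this snippet is only for illustration and uses a few made-up labels and predictions, not the Titanic data), you can compare the analytic gradient against a numerical finite-difference approximation:
# Sanity check (illustrative only): analytic gradient vs. finite differences
y_check = np.array([0.0, 1.0, 1.0])
p_check = np.array([0.2, 0.7, 0.9])
d_check = xgb.DMatrix(data=np.zeros((3, 1)), label=y_check)
eps = 1e-6
def log_loss_value(y, p):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))
numeric_grad = (log_loss_value(y_check, p_check + eps)
                - log_loss_value(y_check, p_check - eps)) / (2 * eps)
analytic_grad, _ = custom_log_loss(p_check, d_check)
print(np.allclose(numeric_grad, analytic_grad, atol=1e-4))  # expect True
If the derivatives were wrong, boosting would still run, just in the wrong direction, so a check like this is cheap insurance.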
We clip the predictions to avoid division by zero errors. Now let’s train.
dtrain = xgb.DMatrix(data=X, label=y)
model = xgb.train({'tree_method': 'hist', 'seed': 1994},
                  dtrain=dtrain,
                  num_boost_round=10,
                  obj=custom_log_loss)
log_loss(y_pred=np.clip(model.predict(dtrain), a_max=1, a_min=0), y_true=y)
>>>0.24912
Note that with a custom objective, predict returns the raw boosted score (no sigmoid is applied), which is why we clip it back into [0, 1] before computing the log loss above. Now, a comparison with the standard implementation:
clf = xgb.XGBClassifier(n_estimators=10, **{'tree_method': 'hist', 'seed': 1994})
clf.fit(X, y)
log_loss(y_pred=np.clip(clf.predict_proba(X)[:, 1], a_max=1, a_min=0), y_true=y)
>>>0.2861
As we can see, the log loss from our custom implementation (0.249) is close to that of the standard implementation (0.286). Of course, you should use the built-in objective when it's available, but when you do need a custom loss function, you now know how to write one.


