# How to Build a C++ Model in a Python Machine Learning Project

##### March 3, 2022

- Topics: Machine Learning

Python is a popular choice for building Machine Learning models thanks to its large community, many libraries, and short, easy-to-understand code. However, pure Python code executes slowly compared with compiled languages. This is where a fast compiled language like C++ comes in.

Though we can build a fast ML model using C++, it’s no match for Python when it comes to the number of Machine Learning libraries. Nevertheless, we can utilize Python libraries such as *NumPy* and *Pandas* for data preprocessing and then build a model running on C++.

Python’s *ctypes* module lets us load a shared library and call the functions it exports with C linkage, which is how we will use C++ code in our program. In this article, we are going to harness ctypes’ capabilities and create an ML model. We will build a *Logistic Regression model* and then optimize it using *Gradient Descent*. The main aim of this article is to show you how to build your own custom model using C++.

### Prerequisites

This is fairly advanced content. Therefore, a solid understanding of the following is required:

- *C++*: You should have some knowledge of pointers, data structures like vectors, and object-oriented programming semantics.
- *Python*: You should be familiar with its tooling and ecosystem.
- Machine Learning concepts.

You also need to approach this tutorial with a research-oriented mindset. This is a required skill for a data scientist.

### Overview

We will start by briefly looking at what Logistic regression entails. Next, we will discuss the Gradient descent optimization algorithm.

Thereafter, we will write the C++ code. Finally, we will build the C++ file as a shared library and consume it in Python using the *ctypes* module.

Let’s get started!

### Logistic regression

Logistic regression is a classification algorithm used in supervised learning. It estimates the probability that an instance belongs to a particular class. It does so by calculating the sum of the features multiplied by their weights, plus the bias term.

To perform the prediction, the sum is passed into a sigmoid function, which maps it to a probability between 0 and 1:

$$\hat{p} = \sigma(w \cdot x + b) = \frac{1}{1 + e^{-(w \cdot x + b)}}$$

The cost function (*log loss*) is small when the model outputs a high probability for a positive instance and a low probability for a negative instance, and large otherwise.

The cost for the whole training set is the average of all the instances’ costs. As we will see, the derivative of an instance’s cost reduces to its prediction error, i.e. the predicted probability minus the actual label.

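Because the log loss is convex, we can minimize it using an optimization algorithm such as gradient descent. To do that, we need the derivative of the *log loss* with respect to each parameter. For weight $w_j$ over $m$ training instances, the partial derivative is:

$$\frac{\partial J}{\partial w_j} = \frac{1}{m} \sum_{i=1}^{m} \left( \sigma(w \cdot x^{(i)} + b) - y^{(i)} \right) x_j^{(i)}$$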

If you want to look at how this function is derived, have a look at this article.

We will look at this function in detail later in the C++ code. Find out more about Logistic Regression here.
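To make the cost and its derivative concrete, here is a minimal NumPy sketch. The function names (`sigmoid`, `log_loss`, `gradients`) and the toy data are made up for illustration and are not part of the C++ project. The sketch compares the analytic partial derivative for the first weight with a finite-difference estimate.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(w, b, X, y):
    """Average log loss over all rows of X."""
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def gradients(w, b, X, y):
    """Partial derivatives of the log loss with respect to w and b."""
    error = sigmoid(X @ w + b) - y   # prediction error per row
    grad_w = X.T @ error / len(y)    # average of error * feature value
    grad_b = error.mean()            # average prediction error
    return grad_w, grad_b

# Made-up toy data: 4 rows, 2 features.
X = np.array([[0.5, 1.0], [1.5, -0.5], [-1.0, 2.0], [2.0, 0.3]])
y = np.array([0.0, 1.0, 0.0, 1.0])
w = np.array([0.1, -0.2])
b = 0.05

grad_w, grad_b = gradients(w, b, X, y)

# Finite-difference estimate of the first weight's partial derivative.
eps = 1e-6
w_shift = w.copy()
w_shift[0] += eps
numeric = (log_loss(w_shift, b, X, y) - log_loss(w, b, X, y)) / eps

print("analytic:", grad_w[0], "numeric:", numeric)
```

The two printed values should agree to several decimal places, which is a quick sanity check that the derivative matches the cost function.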

### Gradient descent (GD) algorithm

Gradient descent minimizes a cost function by repeatedly updating the model’s parameters (the weights and the bias) until convergence is reached.

GD calculates the gradient of the error function and moves along a descending gradient until a minimum is reached. Have a look at the pseudocode below:

```
weight = 0
bias = 0
update until minimum:
    weight = weight - (learning rate × (weight gradient))
    bias = bias - (learning rate × (bias gradient))
```

For logistic regression, the gradient of the bias is the average prediction error, while the gradient of each weight is the average of the prediction error multiplied by the corresponding feature value.

The learning rate controls the size of each update step, and therefore how many iterations are needed to reach convergence. More on GD can be found here.
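The update rule above can be sketched in a few lines of NumPy. This is an illustrative version under simplifying assumptions: it runs for a fixed number of iterations instead of testing for convergence, and the function name `gradient_descent` and the toy data are invented for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, learning_rate=0.1, iterations=1000):
    """Batch gradient descent for logistic regression, starting from zeros."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(iterations):
        # The error is recomputed from scratch in every iteration.
        error = sigmoid(X @ w + b) - y
        w -= learning_rate * (X.T @ error) / len(y)  # weight gradient step
        b -= learning_rate * error.mean()            # bias gradient step
    return w, b

# Made-up, linearly separable toy data with a single feature.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w, b = gradient_descent(X, y)
predicted = (sigmoid(X @ w + b) > 0.5).astype(float)
print(w, b, predicted)
```

Note that the gradients are recomputed in each pass rather than carried over from the previous one. The C++ implementation needs to take the same care with its accumulators.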

Let’s now look at the C++ code.

### C++ code

We will break the project into small sections before showing the full code.

The first step is to include the required headers and bring in the `std` namespace:

```
#include <iostream>
#include <cmath>
#include <vector>
using namespace std;
```

A class with the method signatures is created as follows:

```
class CPPLogisticRegression{
public:
    //method for updating the weights and bias
    vector<double> updateWeightsAndBias(int noOfIterations, int noOfRows, int noOfColumns);
    //method for the prediction
    double predict(vector<double> vW, double* X_train_test);
};
```

#### Updating weights and the bias term

Next, we dissect the method for updating the weights and the bias term:

```
vector<double> CPPLogisticRegression::updateWeightsAndBias(int noOfIterations, int noOfRows, int noOfColumns){
    double row_pred_diff = 0.0;
    double total_diff = 0.0;
    //vectors are used because standard C++ has no variable-length arrays
    vector<double> feature_weight(noOfColumns, 0.0);
    vector<double> total_feature_weight(noOfColumns, 0.0);
    vector<double> weight_derivative(noOfColumns, 0.0);
    double bias_derivative = 0.0;
    vector<double> W(noOfColumns, 0.0);
    double bias = 0.0;
    vector<double> vWB;
    //train set (9 rows, 13 columns)
    vector<vector<double>> X_train = {
        {57.0,0.0,0.0,140.0,241.0,0.0,1.0,123.0,1.0,0.2,1.0,0.0,3.0},
        {45.0,1.0,3.0,110.0,264.0,0.0,1.0,2.0,0.0,1.2,1.0,0.0,3.0},
        {68.0,1.0,0.0,144.0,13.0,1.0,1.0,141.0,0.0,3.4,1.0,2.0,3.0},
        {57.0,1.0,0.0,80.0,1.0,0.0,1.0,115.0,1.0,1.2,1.0,1.0,3.0},
        {57.0,0.0,1.0,0.0,236.0,0.0,0.0,174.0,0.0,0.0,1.0,1.0,2.0},
        {61.0,1.0,0.0,140.0,207.0,0.0,0.0,8.0,1.0,1.4,2.0,1.0,3.0},
        {46.0,1.0,0.0,140.0,311.0,0.0,1.0,120.0,1.0,1.8,1.0,2.0,3.0},
        {62.0,1.0,1.0,128.0,208.0,1.0,0.0,140.0,0.0,0.0,2.0,0.0,2.0},
        {62.0,1.0,1.0,128.0,208.0,1.0,0.0,140.0,0.0,0.0,2.0,0.0,2.0}};
    //labels
    vector<double> Y = {0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0};
    for (int l = 0; l < noOfIterations; l++){
        //resetting the accumulators so each iteration uses a fresh gradient
        total_diff = 0.0;
        for (int k = 0; k < noOfColumns; k++){
            total_feature_weight[k] = 0.0;
        }
        //keeping only the latest weights and bias in the result vector
        vWB.clear();
        for (int i = 0; i < noOfRows; i++){
            double Wx = 0.0;
            //computing W.x
            for (int j = 0; j < noOfColumns; j++){
                Wx += W[j] * X_train[i][j];
            }
            //computing σ(W.x + b) - Y
            row_pred_diff = (1/(1 + exp(-(Wx+bias))))-Y[i];
            for (int k = 0; k < noOfColumns; k++){
                //computing (σ(W.x + b) - Y) × x(i)
                feature_weight[k] = row_pred_diff * X_train[i][k];
                //summation(Σ) of each feature weight
                total_feature_weight[k] += feature_weight[k];
            }
            //summation(Σ) of the prediction errors
            total_diff += row_pred_diff;
        }
        //updating the weights for each feature
        for (int z = 0; z < noOfColumns; z++){
            //averaging the accumulated gradient over the rows (1/m)
            weight_derivative[z] = total_feature_weight[z]/noOfRows;
            W[z] = W[z] - 0.1 * weight_derivative[z];
            //storing the values in a vector
            vWB.push_back(W[z]);
        }
        //calculating the bias
        bias_derivative = total_diff/noOfRows;
        bias = bias - 0.1 * bias_derivative;
        vWB.push_back(bias);
    }
    return vWB;
}
```

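We first initialize the containers for the weights and the accumulated gradients. Next, we create a *for-loop* over the iterations. Each iteration resets the accumulators, clears the result vector so that it only holds the latest weights and bias, and then runs two inner loops.

The first inner loop visits every row. Inside it, one for-loop computes the weighted *sum(W.x)* and another accumulates the prediction error multiplied by each *feature value*.

At the end of each row, we add the row's prediction error to a running *summation(Σ)*.

```
//resetting the accumulators so each iteration uses a fresh gradient
total_diff = 0.0;
for (int k = 0; k < noOfColumns; k++){
    total_feature_weight[k] = 0.0;
}
//keeping only the latest weights and bias in the result vector
vWB.clear();
for (int i = 0; i < noOfRows; i++){
    double Wx = 0.0;
    //computing W.x
    for (int j = 0; j < noOfColumns; j++){
        Wx += W[j] * X_train[i][j];
    }
    //computing σ(W.x + b) - Y
    row_pred_diff = (1/(1 + exp(-(Wx+bias))))-Y[i];
    for (int k = 0; k < noOfColumns; k++){
        //computing (σ(W.x + b) - Y) × x(i)
        feature_weight[k] = row_pred_diff * X_train[i][k];
        //summation(Σ) of each feature weight
        total_feature_weight[k] += feature_weight[k];
    }
    //summation(Σ) of the prediction errors
    total_diff += row_pred_diff;
}
```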

In the second inner loop, we compute the gradient of each weight by averaging its accumulated value over the rows, and then update the weight.

The updated weights are then stored in a vector (0.1 is the learning rate).

```
for (int z = 0; z < noOfColumns; z++){
    //averaging the accumulated gradient over the rows (1/m)
    weight_derivative[z] = total_feature_weight[z]/noOfRows;
    W[z] = W[z] - 0.1 * weight_derivative[z];
    //storing the values in a vector
    vWB.push_back(W[z]);
}
```

The last step in the outer loop is updating the *bias* term and storing it as the last item in the vector.

We store the weights and the bias in one vector because a C++ function returns a single value, unlike Python, where returning several values is built in. A struct or `std::pair` would work as well.

```
//calculating the bias
bias_derivative = total_diff/noOfRows;
bias = bias - 0.1 * bias_derivative;
vWB.push_back(bias);
```

The function returns the *vector* containing the *weights* and the *bias* term.

#### Prediction

The vector we returned from the previous function is passed into this function together with an *array* of test features.

We calculate the weighted sum as we did in the previous function then calculate the *sigmoid* to get a probability.

The probability is then thresholded at 0.5 to produce a class label. The accuracy is quite low since we only have a few training samples.

```
double CPPLogisticRegression::predict(vector<double> vW, double* X_train_test){
    double predictions = 0.0;
    double Wx_test = 0.0;
    //computing W.x over the 13 features
    for (int j = 0; j < 13; j++){
        Wx_test += (vW[j] * X_train_test[j]);
    }
    //adding the bias term (the last item in the vector) and applying the sigmoid
    predictions = 1/(1 + exp(-(Wx_test + vW.back())));
    //making the prediction by thresholding the probability at 0.5
    if(predictions>0.5){
        predictions = 1.0;
    }else{
        predictions = 0.0;
    }
    return predictions;
}
```

We use the `extern "C"` linkage specification to write functions that will be accessible outside the C++ code.

These are the functions that we will call from the Python code. On Windows, you prepend `__declspec(dllexport)` to each function, i.e.:

```
__declspec(dllexport) CPPLogisticRegression* LogisticRegression(){
//......
}
```

You can read more about *ctypes* from this official documentation.

```
extern "C"{
    //vector to store the weights and bias gotten from the updateWeightsAndBias() function
    vector<double> vX;
    CPPLogisticRegression* LogisticRegression(){
        CPPLogisticRegression* log_reg = new CPPLogisticRegression();
        return log_reg;
    }
    void fit(CPPLogisticRegression* log_reg) {
        vX = log_reg->updateWeightsAndBias(50,9,13);
    }
    double predict(CPPLogisticRegression* log_reg, double* array){
        return log_reg->predict(vX,array);
    }
}
```

In the code above, the `LogisticRegression()` function instantiates the class we created and returns a pointer to it.

The `fit()` function calls the method for updating the *weights* and the *bias* term and stores the returned vector in the global `vX`. That vector is later passed to the class’ `predict()` method inside the exported `predict()` function.

Note the difference between the two similarly named prediction functions. The array for the exported `predict()` function will be passed in from Python.

Here is the full C++ code:

```
#include <iostream>
#include <cmath>
#include <vector>
using namespace std;
class CPPLogisticRegression{
public:
    //method for updating the weights and bias
    vector<double> updateWeightsAndBias(int noOfIterations, int noOfRows, int noOfColumns);
    //method for the prediction
    double predict(vector<double> vW, double* X_train_test);
};
vector<double> CPPLogisticRegression::updateWeightsAndBias(int noOfIterations, int noOfRows, int noOfColumns){
    double row_pred_diff = 0.0;
    double total_diff = 0.0;
    //vectors are used because standard C++ has no variable-length arrays
    vector<double> feature_weight(noOfColumns, 0.0);
    vector<double> total_feature_weight(noOfColumns, 0.0);
    vector<double> weight_derivative(noOfColumns, 0.0);
    double bias_derivative = 0.0;
    vector<double> W(noOfColumns, 0.0);
    double bias = 0.0;
    vector<double> vWB;
    //train set (9 rows, 13 columns)
    vector<vector<double>> X_train = {
        {57.0,0.0,0.0,140.0,241.0,0.0,1.0,123.0,1.0,0.2,1.0,0.0,3.0},
        {45.0,1.0,3.0,110.0,264.0,0.0,1.0,2.0,0.0,1.2,1.0,0.0,3.0},
        {68.0,1.0,0.0,144.0,13.0,1.0,1.0,141.0,0.0,3.4,1.0,2.0,3.0},
        {57.0,1.0,0.0,80.0,1.0,0.0,1.0,115.0,1.0,1.2,1.0,1.0,3.0},
        {57.0,0.0,1.0,0.0,236.0,0.0,0.0,174.0,0.0,0.0,1.0,1.0,2.0},
        {61.0,1.0,0.0,140.0,207.0,0.0,0.0,8.0,1.0,1.4,2.0,1.0,3.0},
        {46.0,1.0,0.0,140.0,311.0,0.0,1.0,120.0,1.0,1.8,1.0,2.0,3.0},
        {62.0,1.0,1.0,128.0,208.0,1.0,0.0,140.0,0.0,0.0,2.0,0.0,2.0},
        {62.0,1.0,1.0,128.0,208.0,1.0,0.0,140.0,0.0,0.0,2.0,0.0,2.0}};
    //labels
    vector<double> Y = {0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0};
    for (int l = 0; l < noOfIterations; l++){
        //resetting the accumulators so each iteration uses a fresh gradient
        total_diff = 0.0;
        for (int k = 0; k < noOfColumns; k++){
            total_feature_weight[k] = 0.0;
        }
        //keeping only the latest weights and bias in the result vector
        vWB.clear();
        for (int i = 0; i < noOfRows; i++){
            double Wx = 0.0;
            //computing W.x
            for (int j = 0; j < noOfColumns; j++){
                Wx += W[j] * X_train[i][j];
            }
            //computing σ(W.x + b) - Y
            row_pred_diff = (1/(1 + exp(-(Wx+bias))))-Y[i];
            for (int k = 0; k < noOfColumns; k++){
                //computing (σ(W.x + b) - Y) × x(i)
                feature_weight[k] = row_pred_diff * X_train[i][k];
                //summation(Σ) of each feature weight
                total_feature_weight[k] += feature_weight[k];
            }
            //summation(Σ) of the prediction errors
            total_diff += row_pred_diff;
        }
        //updating the weights for each feature
        for (int z = 0; z < noOfColumns; z++){
            //averaging the accumulated gradient over the rows (1/m)
            weight_derivative[z] = total_feature_weight[z]/noOfRows;
            W[z] = W[z] - 0.1 * weight_derivative[z];
            //storing the values in a vector
            vWB.push_back(W[z]);
        }
        //calculating the bias
        bias_derivative = total_diff/noOfRows;
        bias = bias - 0.1 * bias_derivative;
        vWB.push_back(bias);
    }
    return vWB;
}
double CPPLogisticRegression::predict(vector<double> vW, double* X_train_test){
    double predictions = 0.0;
    double Wx_test = 0.0;
    //computing W.x over the 13 features
    for (int j = 0; j < 13; j++){
        Wx_test += (vW[j] * X_train_test[j]);
    }
    //adding the bias term (the last item in the vector) and applying the sigmoid
    predictions = 1/(1 + exp(-(Wx_test + vW.back())));
    //making the prediction by thresholding the probability at 0.5
    if(predictions>0.5){
        predictions = 1.0;
    }else{
        predictions = 0.0;
    }
    return predictions;
}
extern "C"{
    //vector to store the weights and bias gotten from the updateWeightsAndBias() function
    vector<double> vX;
    CPPLogisticRegression* LogisticRegression(){
        CPPLogisticRegression* log_reg = new CPPLogisticRegression();
        return log_reg;
    }
    void fit(CPPLogisticRegression* log_reg) {
        vX = log_reg->updateWeightsAndBias(50,9,13);
    }
    double predict(CPPLogisticRegression* log_reg, double* array){
        return log_reg->predict(vX,array);
    }
}
```

Before we look at the Python code, let’s create a shared library.

### Creating a shared library

Create a Python file called *setup.py* and add the following code:

```
from setuptools import setup, Extension
module1 = Extension('logistic',
                    sources = ['logistic.cpp'])
setup (name = 'Logistic Regression Model',
       version = '1.0',
       description = 'This is a Logistic Regression Model written in C++',
       ext_modules = [module1])
```

The above code creates a shared library called *logistic* from the *logistic.cpp* file. The file will be created in the *build* directory.

Note that the output is platform-dependent. On Linux, it will create a `.so` file, while Windows will produce a `.pyd` file.

I ran mine on Linux and it produced a file named *logistic.cpython-310-x86_64-linux-gnu.so*. Be sure to check yours.

Run the code using the following command in your terminal:

```
python setup.py build
```
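Because the file name depends on the platform and the Python version, one option is to search the *build* directory instead of hard-coding the path. The helper below is a hypothetical sketch (the name `find_shared_library` is not part of setuptools) and assumes the library sits in a `lib*` subdirectory of the build directory, as it did in the run described above.

```python
import glob
import os

def find_shared_library(build_dir="build", stem="logistic"):
    """Return the path of the first built library matching the stem, or None."""
    patterns = [
        os.path.join(build_dir, "lib*", stem + "*.so"),   # e.g. Linux
        os.path.join(build_dir, "lib*", stem + "*.pyd"),  # e.g. Windows
    ]
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        if matches:
            return matches[0]
    return None

# Prints the path if the build has been run in this directory, otherwise None.
print(find_shared_library())
```

The returned path can then be passed to `ctypes.CDLL` in place of a hard-coded string.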

### Python code

As we did for the C++ code, we first import the required modules:

```
import ctypes as ct
import numpy as np
import pandas as pd
```

Next, we load the shared library that we created:

```
#the build file location
libfile = r"build/lib.linux-x86_64-3.10/logistic.cpython-310-x86_64-linux-gnu.so"
#loading it for use
our_lib = ct.CDLL(libfile)
```

We then set the argument and return types for the functions in the `extern "C"` section of our C++ file:

```
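#setting the argument and return types for our C++ functions
#the constructor returns a pointer; the default int return type could truncate it
our_lib.LogisticRegression.restype = ct.c_void_p
our_lib.fit.argtypes = [ct.c_void_p]
our_lib.fit.restype = None
our_lib.predict.argtypes = [ct.c_void_p, np.ctypeslib.ndpointer(dtype=np.float64)]
our_lib.predict.restype = ct.c_double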
```
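The `ndpointer` type also acts as a guard: ctypes rejects arrays that do not match it before the C++ code ever sees them. The sketch below illustrates this without loading the library. The `ndim` and `flags` restrictions and the helper `is_accepted` are additions for the example and are not used in the code above.

```python
import numpy as np

# Pointer type that only accepts 1-D, C-contiguous float64 arrays.
double_array = np.ctypeslib.ndpointer(dtype=np.float64, ndim=1, flags="C_CONTIGUOUS")

def is_accepted(array):
    """Return True if the array passes the checks ctypes would run."""
    try:
        double_array.from_param(array)
        return True
    except TypeError:
        return False

good = np.array([62.0, 1.0, 1.0], dtype=np.float64)
wrong_dtype = np.array([62.0, 1.0, 1.0], dtype=np.float32)
strided = np.arange(6, dtype=np.float64)[::2]  # not contiguous in memory

print(is_accepted(good), is_accepted(wrong_dtype), is_accepted(strided))
```

Note that these checks cover the data type and memory layout only. They do not verify that the array holds the 13 values the C++ `predict()` method reads.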

The rest of the code initializes the class, creates the array to be passed to the `predict()` function, and displays the predicted value.

```
#initializing the class
log_reg_obj = our_lib.LogisticRegression()
#the array to test the model
test_features = np.array((62.0,1.0,1.0,128.0,208.0,1.0,0.0,140.0,0.0,0.0,2.0,0.0,2.0))
test_features = test_features.astype(np.double)
#calling the fit method (it returns nothing)
our_lib.fit(log_reg_obj)
#predicting
pred = our_lib.predict(log_reg_obj,test_features)
print("Predicted value:",pred)
```

The full Python code is shown below:

```
import ctypes as ct
import numpy as np
import pandas as pd
#the build file location
libfile = r"build/lib.linux-x86_64-3.10/logistic.cpython-310-x86_64-linux-gnu.so"
#loading it for use
our_lib = ct.CDLL(libfile)
#setting the argument and return types for our C++ functions
#the constructor returns a pointer; the default int return type could truncate it
our_lib.LogisticRegression.restype = ct.c_void_p
our_lib.fit.argtypes = [ct.c_void_p]
our_lib.fit.restype = None
our_lib.predict.argtypes = [ct.c_void_p, np.ctypeslib.ndpointer(dtype=np.float64)]
our_lib.predict.restype = ct.c_double
#initializing the class
log_reg_obj = our_lib.LogisticRegression()
#the array to test the model
test_features = np.array((62.0,1.0,1.0,128.0,208.0,1.0,0.0,140.0,0.0,0.0,2.0,0.0,2.0))
test_features = test_features.astype(np.double)
#calling the fit method (it returns nothing)
our_lib.fit(log_reg_obj)
#predicting
pred = our_lib.predict(log_reg_obj,test_features)
print("Predicted value:",pred)
```
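Since the C++ `predict()` method reads exactly 13 values from the pointer it receives, it can help to check the input on the Python side. The wrapper below is one possible sketch, not part of the original project: the class name `LogisticModel` and the constant `N_FEATURES` are invented here, and `lib` is assumed to be the loaded library with its argument and return types already set.

```python
import numpy as np

N_FEATURES = 13  # assumed to match the column count hard-coded in the C++ code

class LogisticModel:
    """Thin convenience wrapper around the functions exported by the library."""

    def __init__(self, lib):
        # lib: the object returned by ctypes.CDLL, with argtypes/restype set
        self._lib = lib
        self._handle = lib.LogisticRegression()

    def fit(self):
        self._lib.fit(self._handle)

    def predict(self, features):
        # Convert to a contiguous float64 array and check its length
        # before handing the raw pointer to C++.
        row = np.ascontiguousarray(features, dtype=np.float64)
        if row.shape != (N_FEATURES,):
            raise ValueError(f"expected {N_FEATURES} features, got shape {row.shape}")
        return self._lib.predict(self._handle, row)
```

With the library loaded as above, usage would look like `model = LogisticModel(our_lib)`, then `model.fit()` and `model.predict(test_features)`.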

### Conclusion

In this tutorial, we discussed Logistic Regression and the Gradient Descent optimization algorithm. Then we wrote the C++ code, built it as a shared library, and consumed it in Python.

You can, therefore, use this knowledge to create your own C++ models.

Apart from ctypes, there are other wrapper tools such as CFFI, PyBind11, etc. Have a look at this article for more information about them.

Feel free to suggest changes, improvements, and corrections in the comment section below.

Happy coding!

Peer Review Contributions by: Wanja Mike