Using Auto-encoder for Fraud detection implemented in Knime

Published in

Becoming Human: Artificial Intelligence Magazine

8 min read · Dec 19, 2022

Auto-encoders are an unsupervised learning technique using neural networks to learn representations.

Specifically, we will design a neural network architecture with a bottleneck that forces a compressed knowledge representation of the original input. This compression and subsequent reconstruction would be very difficult if the input features were completely independent of one another. However, if some structure exists in the data (i.e., correlations between input features), this structure can be learned and leveraged when forcing the input through the network’s bottleneck.

The dataset for this will be downloaded from here.

The data is in .csv format, so we will use the CSV Reader node. Configure the node and execute it.

After examining the dataset, we will divide it into legitimate and fraudulent transactions. The value 0 in the class column indicates a legitimate transaction, and the value 1 indicates a fraudulent one. For this, we will use the Row Splitter node. Let’s set this up.
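As a rough illustration of what this split does, here is a minimal Python sketch. The sample rows and the column names `Amount` and `Class` are assumptions made for the example; the function only mimics the idea of sending matching rows to one output and the rest to the other.

```python
# Hypothetical rows; the real dataset's columns are not listed in this article.
rows = [
    {"Amount": 12.5, "Class": 0},
    {"Amount": 980.0, "Class": 1},
    {"Amount": 43.2, "Class": 0},
    {"Amount": 5.0, "Class": 0},
]

def split_by_class(rows, column="Class", match_value=0):
    """Return (matching, non_matching), similar to a splitter's two outputs."""
    matching = [r for r in rows if r[column] == match_value]
    non_matching = [r for r in rows if r[column] != match_value]
    return matching, non_matching

legit, fraud = split_by_class(rows)
print(len(legit), "legitimate,", len(fraud), "fraudulent")
```

Keeping the legitimate rows separate is what lets the autoencoder later learn what a normal transaction looks like.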

Now let's split the data for training and validation using the Partitioning node. Configure it, execute it, and inspect the output.

Now split the data again to obtain the validation set. Configure a second Partitioning node and execute it.
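The two partitioning steps can be pictured with the sketch below. The split fractions (80% and 50%), the seed, and the helper name `partition` are assumptions for illustration only, since the settings used in the workflow are not stated here.

```python
import random

def partition(rows, first_fraction, seed=42):
    """Shuffle a copy of rows and split it into two parts by relative size."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * first_fraction)
    return shuffled[:cut], shuffled[cut:]

legit_rows = list(range(100))               # stand-ins for legitimate transactions
train, rest = partition(legit_rows, 0.8)    # first split (assumed 80/20)
validation, holdout = partition(rest, 0.5)  # second split (assumed 50/50)
print(len(train), len(validation), len(holdout))
```

Fixing the seed makes the split reproducible, which helps when comparing runs.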

Now, we’ll use the Normalizer node to normalize the data, and we’ll use min-max normalization. Let’s configure this node, execute it, and see what the output is.

One of the most common methods for normalizing data is min-max normalization. For each feature, the minimum value is converted to a 0, the maximum value is converted to a 1, and all other values are converted to decimals between 0 and 1.
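To make the formula concrete, here is a small sketch of min-max normalization in plain Python. The function names and the sample values are made up for the example; each value is scaled as (x - min) / (max - min) per column.

```python
def fit_min_max(rows):
    """Compute the (min, max) pair of every column from the training rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]

def apply_min_max(rows, model):
    """Scale each value to (x - min) / (max - min); constant columns map to 0.0."""
    scaled = []
    for row in rows:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, (lo, hi) in zip(row, model)
        ])
    return scaled

train = [[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]]
model = fit_min_max(train)
print(apply_min_max(train, model))
# prints [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```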

Now concatenate the output of the partitioned table with the Row Splitter table using the Concatenate node.

Configure this node and execute it.

Now save the normalization model using the Model Writer node. Configure this node and execute it.

Configuring the Model Writer node with the location where the model will be saved

Now apply the same normalization model to the validation (testing) data using the Normalizer (Apply) node. Configure it and execute it.
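The point of saving the normalization model and applying it later is that new data must be scaled with the training statistics, not with its own minimum and maximum. The sketch below illustrates this idea only; the JSON round trip stands in for writing and reading the model, and the column name and values are hypothetical.

```python
import json

# Hypothetical per-column minimum and maximum learned from the training data.
model = {"Amount": {"min": 0.0, "max": 500.0}}

saved = json.dumps(model)     # stands in for writing the model to disk
restored = json.loads(saved)  # stands in for reading it back later

def normalize_with(model, row):
    """Scale a row using stored training statistics, not the row's own range."""
    return {
        name: (value - model[name]["min"]) / (model[name]["max"] - model[name]["min"])
        for name, value in row.items()
    }

print(normalize_with(restored, {"Amount": 125.0}))  # prints {'Amount': 0.25}
print(normalize_with(restored, {"Amount": 750.0}))  # prints {'Amount': 1.5}
```

Note that a value larger than the training maximum ends up above 1, so unseen data is not guaranteed to stay inside the 0 to 1 range.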

The data preprocessing part is now complete.

Now let's create the autoencoder model, starting with the Keras Input Layer node. Configure this node and execute it.

To create the dense (hidden) layers, we will use the Keras Dense Layer node. Configure this node and execute it.

Similarly, perform the same operations for all the nodes as shown below.
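To show what a chain of dense layers with a bottleneck computes, here is a toy forward pass in plain Python. This is only a sketch: the layer sizes (6, 3, 6), the sigmoid activation, and the random weights are assumptions for illustration and are not the configuration used in the workflow.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer with a sigmoid activation."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs

def random_layer(n_in, n_out, rng):
    """Randomly initialised weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
layer_sizes = [6, 3, 6]  # hypothetical: input width, bottleneck, output width
layers = [random_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]

x = [0.1, 0.9, 0.4, 0.0, 1.0, 0.5]  # one normalised row
activations = [x]
for weights, biases in layers:
    activations.append(dense(activations[-1], weights, biases))

code, reconstruction = activations[1], activations[2]
print(len(x), len(code), len(reconstruction))  # prints 6 3 6
```

The output layer has the same width as the input, so the reconstruction can be compared with the original row feature by feature.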

Let us now train the network with the Keras Network Learner node, which performs supervised learning on a Keras deep learning network. For an autoencoder, the input columns also serve as the target columns. Configure this node and execute it.

If you run into a dependency error in this node, please refer to my previous blog.

In this node, we will set the loss function to MSE (mean squared error) and the optimizer to Adam.

The mean squared error is the average of the squared differences between the predicted and actual values. Regardless of the sign of the differences, the result is never negative, and a perfect value is 0.0. Because the differences are squared, larger mistakes contribute disproportionately more to the error than smaller ones, so the model is penalized more heavily for making large errors.
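A short sketch of this calculation, with made-up values, shows the effect of squaring: doubling the size of a miss quadruples its contribution to the error.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

row = [0.0, 0.5, 1.0, 0.25]
small_miss = [0.0, 0.5, 1.0, 0.5]   # last feature off by 0.25
large_miss = [0.0, 0.5, 1.0, 0.75]  # last feature off by 0.5

print(mean_squared_error(row, small_miss))  # prints 0.015625
print(mean_squared_error(row, large_miss))  # prints 0.0625
```

For an autoencoder, `actual` is the input row and `predicted` is its reconstruction, so the same function gives a per-row reconstruction error.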

Configuring the input data in the Keras network learner node

Now let us run the trained network using the Keras Network Executor node. Configure this node and execute it.

Let's optimize the threshold using the threshold optimization node.
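How the threshold is optimized is not spelled out here, so the following is just one possible approach, under stated assumptions: try every observed reconstruction error as a cut-off and keep the one with the best F1 score for the fraud class. The error values, the labels, and the choice of F1 as the criterion are all hypothetical.

```python
def f1_at(threshold, errors, labels):
    """F1 score for the fraud class when rows with error > threshold are flagged."""
    tp = sum(1 for e, y in zip(errors, labels) if e > threshold and y == 1)
    fp = sum(1 for e, y in zip(errors, labels) if e > threshold and y == 0)
    fn = sum(1 for e, y in zip(errors, labels) if e <= threshold and y == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def best_threshold(errors, labels):
    """Try every observed error as a cut-off and keep the best-scoring one."""
    return max(sorted(set(errors)), key=lambda t: f1_at(t, errors, labels))

# Hypothetical reconstruction errors and true classes (1 = fraud).
errors = [0.01, 0.02, 0.03, 0.04, 0.20, 0.30]
labels = [0, 0, 0, 0, 1, 1]
threshold = best_threshold(errors, labels)
print(threshold)  # prints 0.04
```

Accuracy alone can be misleading when fraud is rare, which is why this sketch scores the fraud class instead.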

Configure the Math Formula node and execute it.

Next, the first row of a data table is used to define new flow variables. The variable names are taken from the column names, and the variable values are taken from the values in that row. We’ll use the Table Row to Variable node for this.

To extract the output column, we will use the Rule Engine node. Configure this node and execute it.

Now convert the number to a string using the Number To String node.
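These two steps can be sketched as follows. The rule shown here (an error above the threshold means fraud) is an assumption about the rule's logic, and the threshold and error values are hypothetical.

```python
def flag_fraud(error, threshold):
    """Rule: a reconstruction error above the threshold means predicted fraud (1)."""
    return 1 if error > threshold else 0

threshold = 0.05                   # hypothetical value
errors = [0.01, 0.30, 0.04, 0.09]  # hypothetical reconstruction errors

predicted = [flag_fraud(e, threshold) for e in errors]
predicted_as_text = [str(p) for p in predicted]  # number-to-string step
print(predicted_as_text)  # prints ['0', '1', '0', '1']
```

Converting the prediction to a string is useful when the predicted column has to be compared with a class column of the same type.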

Now let’s check the accuracy of our model using the Scorer node.
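For readers who want to see what such a score is made of, here is a small sketch that counts the confusion-matrix cells and computes accuracy. The labels are invented for the example and are not results from this workflow.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Count (actual, predicted) label pairs, i.e. the cells of a confusion matrix."""
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Fraction of rows where the predicted label equals the actual label."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

actual    = ["0", "0", "0", "1", "1", "0"]
predicted = ["0", "0", "1", "1", "0", "0"]

counts = confusion_counts(actual, predicted)
print("true positives:", counts[("1", "1")])
print("false positives:", counts[("0", "1")])
print("false negatives:", counts[("1", "0")])
print("true negatives:", counts[("0", "0")])
print("accuracy:", accuracy(actual, predicted))
```

With heavily imbalanced data it is worth looking at the false negatives as well as the accuracy, because missed fraud cases barely move the accuracy figure.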

Let us now put the model into action. We will use the data created by the writer node in this case. If you can’t find the data, you can download it from here.

Because the data is stored as .csv, .h5, and table files, we will use the CSV Reader, Model Reader, Keras Network Reader, and Table Reader nodes.

Let’s start with the CSV Reader node. Configure this node and execute it.

Now read the normalization model created in the training part using the Model Reader node.

Similarly, use the Keras Network Reader node to read the Keras model in .h5 format.

Configuring the Keras Network Reader Node

Let us read the data stored in table format using the Table Reader node.

Process workflow for reading the data parameter

Now apply the normalization using the Normalizer (Apply) node.

Now, let's execute the Keras Network Executor node.

Configuration of Keras network executor node

Similarly, execute the Math Formula node with the same configuration as before.

Now extract the output using the Rule Engine node.

Now convert the table row to a flow variable using the Table Row to Variable node.

Configuration of the Table Row to Variable node

Next, we will use the CASE Switch Start node. Configure this node and execute it.

If a fraudulent transaction is detected, an email is sent directly to the owner via the Send Email node.
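The branching logic (do something only when fraud is flagged) can be sketched in Python as below. This only builds the message and does not send anything; the function name, transaction IDs, recipient address, and wording are hypothetical, and actually sending mail would additionally need an SMTP server.

```python
from email.message import EmailMessage

def build_alert(is_fraud, transaction_id, recipient):
    """Return an alert email for a flagged transaction, or None otherwise."""
    if not is_fraud:
        return None
    message = EmailMessage()
    message["To"] = recipient
    message["Subject"] = f"Possible fraud on transaction {transaction_id}"
    message.set_content("The autoencoder flagged this transaction for review.")
    return message

alert = build_alert(True, "TX-001", "owner@example.com")
no_alert = build_alert(False, "TX-002", "owner@example.com")
print(alert["Subject"])
print(no_alert)
```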

As you can see, the image above is a final workflow for deployment, and the image above that is a configuration of the send email node.

Thank You!!!

You can DM me on LinkedIn or Instagram if you have any further questions about Knime/Python Development, Machine Learning/Deep Learning, Coding, Blogging, or Tech Documentation. Special credits to my team members: Siddhid and Anshika.

Written by Anubhav Chaturvedi
