Building a Siamese CNN for Fingerprint Recognition: A Journey from Concept to Implementation

5 minute read

Published:

The idea for this project stemmed from a collaboration with my friend Jovan on his Bachelor’s thesis. His concept was to use a Siamese Convolutional Neural Network (Siamese CNN) for fingerprint recognition, structured as follows: This blog outlines how we implemented this in Python & Keras, while dealing with dataset augmentation, Siamese architecture, and model validation. You can explore the project’s Git repo and Jupyter Notebooks.

  1. Two parallel CNN branches (with shared weights) extract feature representations from input fingerprint images.
  2. Feature subtraction is applied to compare the extracted features.
  3. A final convolutional + pooling layer processes the difference.
  4. A sigmoid layer classifies whether the fingerprints match.

This blog details the step-by-step implementation of this idea, covering dataset preparation, augmentation, model architecture, training, and evaluation.

image


Dataset Preparation and Label Extraction

Our fingerprint dataset follows a strict naming convention. For instance:

001__M_Left_index_SWarp.BMP

From this, we extract four key labels:

  • Subject ID (001)
  • Gender (M = 0, F = 1)
  • Hand (Left = 0, Right = 1)
  • Finger type (Index = 1)

We implemented this parsing logic in the extract_label function:

def extract_label(img_path):
    ...
    gender = 0 if gender_str == 'M' else 1
    hand = 0 if hand_str == 'Left' else 1
    finger_map = {'thumb': 0, 'index': 1, 'middle': 2, 'ring': 3, 'little': 4}
    ...

We load and preprocess all Altered-Easy images by resizing them to 90x90 pixels and saving both the image arrays and label arrays:

images = np.empty((num_images, 90, 90), dtype=np.uint8)
labels = np.empty((num_images, 4), dtype=np.uint16)
...
images[i] = cv2.resize(image, target_size)
labels[i] = extract_label_alt(image_path)

Now, our dataset consists of four subsets:

  • Real (original fingerprints)
  • Easy, Medium, and Hard (distorted fingerprints with increasing complexity)

Loading the Dataset

We use NumPy to load the dataset, which contains images and corresponding labels:

def load_dataset():
    datasets = ['real', 'easy', 'medium', 'hard']
    data_dict = {}
    for dataset in datasets:
        x_data = np.load(f'dataset/x_{dataset}.npy')
        y_data = np.load(f'dataset/y_{dataset}.npy')
        data_dict[dataset] = (x_data, y_data)
        print(f"{dataset}: X shape {x_data.shape}, Y shape {y_data.shape}")
    return data_dict

data_dict = load_dataset()

Data Visualization

A quick visualization ensures that images are loaded correctly:

def visualize_samples(data_dict):
    plt.figure(figsize=(15, 10))
    for idx, (key, (x, y)) in enumerate(data_dict.items(), start=1):
        plt.subplot(1, 4, idx)
        plt.title(y[0])
        plt.imshow(x[0].squeeze(), cmap='gray')
    plt.show()

visualize_samples(data_dict)

Train-Test Split

We combine easy, medium, and hard subsets to create a training set and reserve 10% for validation:

x_data = np.concatenate([data_dict['easy'][0], data_dict['medium'][0], data_dict['hard'][0]], axis=0)
label_data = np.concatenate([data_dict['easy'][1], data_dict['medium'][1], data_dict['hard'][1]], axis=0)

x_train, x_val, label_train, label_val = train_test_split(x_data, label_data, test_size=0.1)

Data Augmentation

Since imgaug is not compatible with NumPy 2.x, we downgraded NumPy:

pip install "numpy<2.0"

We use image augmentation to improve model generalization:

import imgaug.augmenters as iaa

def preview_augmentation(images):
    seq = iaa.Sequential([
        iaa.GaussianBlur(sigma=(0, 0.5)),
        iaa.Affine(
            scale={"x": (0.9, 1.1), "y": (0.9, 1.1)},
            translate_percent={"x": (-0.1, 0.1), "y": (-0.1, 0.1)},
            rotate=(-30, 30),
            order=[0, 1],
            cval=255
        )
    ], random_order=True)

    augmented_images = seq.augment_images(images)
    
    plt.figure(figsize=(16, 6))
    for i, aug in enumerate(augmented_images):
        plt.subplot(2, 5, i + 1)
        plt.imshow(aug.squeeze(), cmap='gray')
    plt.show()

preview_augmentation([x_data[40000]] * 9)

Data Generator

Since the dataset is large, we use a custom Keras DataGenerator to load images in batches during training.

class DataGenerator(keras.utils.Sequence):
    def __init__(self, x, label, x_real, label_real_dict, batch_size=32, shuffle=True):
        self.x = x
        self.label = label
        self.x_real = x_real
        self.label_real_dict = label_real_dict
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.on_epoch_end()

    def __getitem__(self, index):
        x1_batch = self.x[index * self.batch_size:(index + 1) * self.batch_size]
        label_batch = self.label[index * self.batch_size:(index + 1) * self.batch_size]

        x2_batch = np.empty((self.batch_size, 90, 90, 1), dtype=np.float32)
        y_batch = np.zeros((self.batch_size, 1), dtype=np.float32)

        for i, l in enumerate(label_batch):
            match_key = ''.join(l.astype(str)).zfill(6)
            if random.random() > 0.5:
                x2_batch[i] = self.x_real[self.label_real_dict[match_key]][..., np.newaxis]
                y_batch[i] = 1.
            else:
                while True:
                    unmatch_key, unmatch_idx = random.choice(list(self.label_real_dict.items()))
                    if unmatch_key != match_key:
                        x2_batch[i] = self.x_real[unmatch_idx][..., np.newaxis]
                        break
                y_batch[i] = 0.

        return [x1_batch.astype(np.float32) / 255., x2_batch.astype(np.float32) / 255.], y_batch

train_gen = DataGenerator(x_train, label_train, data_dict['real'][0], label_real_dict, shuffle=True)
val_gen = DataGenerator(x_val, label_val, data_dict['real'][0], label_real_dict, shuffle=False)

Model Architecture

There are many different ways to solve this problem (see Notebooks on Kaggle), but we chose to implement Siamese CNN: image

from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout, Subtract
from tensorflow.keras.models import Model

def create_base_network(input_shape):
    input = Input(shape=input_shape)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(input)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Flatten()(x)
    return Model(input, x)

input_shape = (90, 90, 1)
base_network = create_base_network(input_shape)

input_a = Input(shape=input_shape)
input_b = Input(shape=input_shape)

processed_a = base_network(input_a)
processed_b = base_network(input_b)

subtracted = Subtract()([processed_a, processed_b])
x = Dense(128, activation='relu')(subtracted)
output = Dense(1, activation='sigmoid')(x)

model = Model(inputs=[input_a, input_b], outputs=output)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

Training

history = model.fit(train_gen, epochs=15, validation_data=val_gen)

Model Evaluation

We visualize model performance using Confusion Matrix, Precision-Recall Curve, and ROC Curve.

image

from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns

y_pred_labels = model.predict(val_gen) > 0.5
cm = confusion_matrix(y_true, y_pred_labels)

sns.heatmap(cm, annot=True, fmt='d', cmap='viridis')
plt.show()

print(classification_report(y_true, y_pred_labels))

image

Results Summary

  • Precision: 1.00 for matching fingerprints.
  • Recall: 0.73 for matching fingerprints (improvement needed).
  • Overall Accuracy: 86%.
  • ROC AUC: 1.00 (near-perfect classification).

Conclusion

This project successfully implemented a Siamese CNN for fingerprint recognition. Future improvements include class-balanced loss functions, adversarial training, and threshold tuning to enhance recall. This approach can be extended to:

  • Face or iris verification
  • Signature comparison
  • Duplicate detection in document datasets

Jovan’s idea has great potential for real-world security applications! 🚀