Understanding Neural Style Transfer with Tensorflow and Keras

Neural Style Transfer (NST)  is a fascinating area of Deep Learning and Convolutional Neural Networks. NST is an interesting technique, in which the style from an image, known as the ‘style image’ is transferred to another image ‘content image’ and we get a third a image which is a generated image which has the content of the original image and the style of another image.

NST can be used to reimagine how famous painters like Van Gogh, Claude Monet or a Picasso would have visualised a scenery or architecture. NST uses Convolutional Neural Networks (CNNs) to achieve this artistic style transfer from one image to another. NST was originally implemented by Gati et al., in their paper Neural Algorithm of Artistic Style. Convolutional Neural Networks have been very successful in image classification image recognition et cetera. CNN networks have also been have also generated very interesting pictures using Neural Style Transfer which will be shown in this post. An interesting aspect of CNN’s is that the first couple of layers in the CNN capture basic features of the image like edges and  pixel values. But as we go deeper into the CNN, the network captures higher level features of the input image.

To get started with Neural Style transfer  we will be using the VGG19 pre-trained network. The VGG19 CNN is a compact pre-trained your network which can be used for performing the NST. However, we could have also used Resnet or InceptionV3 networks for this purpose but these are very large networks. The idea of using a network trained on a different task and applying it to a new task is called transfer learning.

What needs to be done to transfer the style from one of the image to another image. This brings us to the question – What is ‘style’? What is it that distinguishes Van Gogh’s painting or Picasso’s cubist art. Convolutional Neural Networks capture basic features in the lower layers and much more complex features in the deeper layers.  Style can be computed by taking the correlation of the feature maps in a layer L. This is my interpretation of how style is captured.  Since style  is intrinsic to  the image, it  implies that the style feature would exist across all the filters in a layer. Hence, to pick up this style we would need to get the correlation of the filters across channels of a lawyer. This is computed mathematically, using the Gram matrix which calculates the correlation of the activation of a the filter by the style image and generated image

To transfer the style from one image to the content image we need to do two parallel operations while doing forward propagation
– Compute the content loss between the source image and the generated image
– Compute the style loss between the style image and the generated image
– Finally we need to compute the total loss

In order to get transfer the style from the ‘style’ image to the ‘content ‘image resulting in a  ‘generated’  image  the total loss has to be minimised. Therefore backward propagation with gradient descent  is done to minimise the total loss comprising of the content and style loss.

Initially we make the Generated Image ‘G’ the same as the source image ‘S’

The content loss at layer ‘l’

$L_{content} = 1/2 \sum_{i}^{j} ( F^{l}_{i,j} - P^{l}_{i,j})^{2}$

where $F^{l}_{i,j}$ and $P^{l}_{i,j}$ represent the activations at layer ‘l’ in a filter i, at position ‘j’. The intuition is that the activations will be same for similar source and generated image. We need to minimise the content loss so that the generated stylized image is as close to the original image as possible. An intermediate layer of VGG19 block5_conv2 is used

The Style layers that are are used are

style_layers = [‘block1_conv1’,
‘block2_conv1’,
‘block3_conv1’,
‘block4_conv1’,
‘block5_conv1’]
To compute the Style Loss the Gram matrix needs to be computed. The Gram Matrix is computed by unrolling the filters as shown below (source: Convolutional Neural Networks by Prof Andrew Ng, Coursera). The result is a matrix of size $n_{c}$ x $n_{c}$ where $n_{c}$ is the number of channels
The above diagram shows the filters of height $n_{H}$ and width $n_{W}$ with $n_{C}$ channels
The contribution of layer ‘l’ to style loss is given by
$L^{'}_{style} = \frac{\sum_{i}^{j} (G^{2}_{i,j} - A^l{i,j})^2}{4N^{2}_{l}M^{2}_{l}}$
where $G_{i,j}$  and $A_{i,j}$ are the Gram matrices of the style and generated images respectively. By minimising the distance in the gram matrices of the style and generated image we can ensure that generated image is a stylized version of the original image similar to the style image
The total loss is given by
$L_{total} = \alpha L_{content} + \beta L_{style}$
Back propagation with gradient descent works to minimise the content loss between the source and generated image, while the style loss tries to minimise the discrepancies in the style of the style image and generated image. Running through forward and backpropagation through several epochs successfully transfers the style from the style image to the source image.
You can check the Notebook at Neural Style Transfer

Note: The code in this notebook is largely based on the Neural Style Transfer tutorial from Tensorflow, though I may have taken some changes from other blogs. I also made a few changes to the code in this tutorial, like removing the scaling factor, or the class definition (Personally, I belong to the old school (C language) and am not much in love with the ‘self.”..All references are included below

Note: Here is a interesting thought. Could we do a Neural Style Transfer in music? Imagine Carlos Santana playing ‘Hotel California’ or Brian May style in ‘Another brick in the wall’. While our first reaction would be that it may not sound good as we are used to style of these songs, we may be surprised by a possible style transfer. This is definitely music to the ears!

Here are few runs from this

A) Run 1

1. Neural Style Transfer – a) Content Image – My portrait.  b) Style Image – Wassily Kadinsky Oil on canvas, 1913, Vassily Kadinsky’s composition

2. Result of Neural Style Transfer

2) Run 2

a) Content Image – Portrait of my parents b) Style Image –  Vincent Van Gogh’s ,Starry Night Oil on canvas 1889

2. Result of Neural Style Transfer

Run 3

1.  Content Image – Caesar 2 (Masai Mara- 20 Jun 2018).  Style Image – The Great Wave at Kanagawa – Katsushika Hokosai, 1826-1833

2. Result of Neural Style Transfer

Run 4

1.   Content Image – Junagarh Fort , Rajasthan   Sep 2016              b) Style Image – Le Pont Japonais by Claude Monet, Oil on canvas, 1920

2. Result of Neural Style Transfer

Neural Style Transfer is a very ingenious idea which shows that we can segregate the style of a painting and transfer to another image.

References

1. A Neural Algorithm of Artistic Style, Leon A. Gatys, Alexander S. Ecker, Matthias Bethge
2. Neural style transfer
3. Neural Style Transfer: Creating Art with Deep Learning using tf.keras and eager execution
4. Convolutional Neural Network, DeepLearning.AI Specialization, Prof Andrew Ng
5. Intuitive Guide to Neural Style Transfer

To see all posts click Index of posts

De-blurring revisited with Wiener filter using OpenCV

In this post I continue to experiment with the de-blurring of images using the Wiener filter. For details on the Wiener filter, please look at my earlier post “Dabbling with Wiener filter using OpenCV”.  Thanks to Egli Simon, Switzerland for pointing out a bug in the earlier post which I have now fixed.  The Wiener filter attempts to de-blur by assuming that the source signal is convolved with a blur kernel in the presence of noise.   I have also included the blur kernel as estimated by E. Simon in the code. I am including the de-blurring with 3 different blur kernel radii and different values for the Wiener constant K.  While the de-blurring is still a long way off there is some success.

One of the reasons I have assumed a non-blind blur kernel and try to de-convolve with that. The Wiener filter tries to minimize the Mean Square Error (MSE)  which can be expressed as
e(f) = E[X(f) – X1(f)]^2                                – (1)

where e(f) is the Mean Square Error(MSE) in the frequency domain, X(f) is the original image and X1(f) is the estimated signal which we get by de-convolving the Wiener filter with the observed blurred image i.e. and E[] is the expectation

X1(f) = G(f) * Y(f)                                        -(2)
where G(f) is the Wiener de-convolution filter and Y(f) is the observed blurred image
Substituting (2) in (1) we get
e(f) = E[X(f) – G(f) *Y(f)]^2

If the above equation is solved we can effectively remove the blur.
Feel free to post comments, opinions or ideas.

Watch this space …
I will be back! Hasta la Vista!

Note: You can clone the code below from Git Hib – Another implementation of Weiner filter in OpenCV

The complete code is included below and should work as is

/*
============================================================================
Name : deblur_wiener.c
Author : Tinniam V Ganesh & Egli Simon
Version :
Description : Implementation of Wiener filter in OpenCV
============================================================================
*/

#include <stdio.h>
#include <stdlib.h>
#include “cxcore.h”
#include “cv.h”
#include “highgui.h”

#define kappa 10.0

int main(int argc, char ** argv)
{
int height,width,step,channels,depth;
uchar* data1;

CvMat *dft_A;
CvMat *dft_B;

CvMat *dft_C;
IplImage* im;
IplImage* im1;

IplImage* image_ReB;
IplImage* image_ImB;

IplImage* image_ReC;
IplImage* image_ImC;
IplImage* complex_ImC;
CvScalar val;

IplImage* k_image_hdr;
int i,j,k;
char str[80];
FILE *fp;
fp = fopen(“test.txt”,”w+”);

int dft_M,dft_N;
int dft_M1,dft_N1;

CvMat* cvShowDFT1(IplImage*, int, int,char*);
void cvShowInvDFT1(IplImage*, CvMat*, int, int,char*);

cvNamedWindow(“Original-color”, 0);
cvShowImage(“Original-color”, im1);
if( !im )
return -1;

cvNamedWindow(“Original-gray”, 0);
cvShowImage(“Original-gray”, im);

// Create a random noise matrix
fp = fopen(“test.txt”,”w+”);
int val_noise[357*383];
for(i=0; i <im->height;i++){
for(j=0;j<im->width;j++){
fprintf(fp, “%d “,(383*i+j));
val_noise[383*i+j] = rand() % 128;
}
fprintf(fp, “\n”);
}

CvMat noise = cvMat(im->height,im->width, CV_8UC1,val_noise);

// Add the random noise matric to the image

cvNamedWindow(“Original + Noise”, 0);
cvShowImage(“Original + Noise”, im);

cvSmooth( im, im, CV_GAUSSIAN, 7, 7, 0.5, 0.5 );
cvNamedWindow(“Gaussian Smooth”, 0);
cvShowImage(“Gaussian Smooth”, im);

// Create a blur kernel
IplImage* k_image;

printf(“rowlength %d\n”,rowLength);
float kernels[rowLength*rowLength];
printf(“rowl: %i”,rowLength);
int norm=0; //Normalization factor
int x,y;
CvMat kernel;
for(x = 0; x < rowLength; x++)
for (y = 0; y < rowLength; y++)
norm++;
// Populate matrix
for (y = 0; y < rowLength; y++) //populate array with values
{
for (x = 0; x < rowLength; x++) {
//kernels[y * rowLength + x] = 255;
kernels[y * rowLength + x] =1.0/norm;
printf(“%f “,1.0/norm);
}
else{
kernels[y * rowLength + x] =0;
}
}
}

/*for (i=0; i < rowLength; i++){
for(j=0;j < rowLength;j++){
printf(“%f “, kernels[i*rowLength +j]);
}
}*/

kernel= cvMat(rowLength, // number of rows
rowLength, // number of columns
CV_32FC1, // matrix data type
&kernels);
k_image = cvGetImage(&kernel,k_image_hdr);

height = k_image->height;
width = k_image->width;
step = k_image->widthStep/sizeof(float);
depth = k_image->depth;

channels = k_image->nChannels;
//data1 = (float *)(k_image->imageData);
data1 = (uchar *)(k_image->imageData);
cvNamedWindow(“blur kernel”, 0);
cvShowImage(“blur kernel”, k_image);

dft_M = cvGetOptimalDFTSize( im->height – 1 );
dft_N = cvGetOptimalDFTSize( im->width – 1 );

//dft_M1 = cvGetOptimalDFTSize( im->height+99 – 1 );
//dft_N1 = cvGetOptimalDFTSize( im->width+99 – 1 );

dft_M1 = cvGetOptimalDFTSize( im->height+3 – 1 );
dft_N1 = cvGetOptimalDFTSize( im->width+3 – 1 );

printf(“dft_N1=%d,dft_M1=%d\n”,dft_N1,dft_M1);

// Perform DFT of original image
dft_A = cvShowDFT1(im, dft_M1, dft_N1,”original”);
//Perform inverse (check)
//cvShowInvDFT1(im,dft_A,dft_M1,dft_N1, “original”); – Commented as it overwrites the DFT

// Perform DFT of kernel
dft_B = cvShowDFT1(k_image,dft_M1,dft_N1,”kernel”);
//Perform inverse of kernel (check)
//cvShowInvDFT1(k_image,dft_B,dft_M1,dft_N1, “kernel”);- Commented as it overwrites the DFT

// Multiply numerator with complex conjugate
dft_C = cvCreateMat( dft_M1, dft_N1, CV_64FC2 );

printf(“%d %d %d %d\n”,dft_M,dft_N,dft_M1,dft_N1);

// Multiply DFT(blurred image) * complex conjugate of blur kernel
cvMulSpectrums(dft_A,dft_B,dft_C,CV_DXT_MUL_CONJ);
//cvShowInvDFT1(im,dft_C,dft_M1,dft_N1,”blur1″);

// Split Fourier in real and imaginary parts
image_ReC = cvCreateImage( cvSize(dft_N1, dft_M1), IPL_DEPTH_64F, 1);
image_ImC = cvCreateImage( cvSize(dft_N1, dft_M1), IPL_DEPTH_64F, 1);
complex_ImC = cvCreateImage( cvSize(dft_N1, dft_M1), IPL_DEPTH_64F, 2);
printf(“%d %d %d %d\n”,dft_M,dft_N,dft_M1,dft_N1);

//cvSplit( dft_C, image_ReC, image_ImC, 0, 0 );
cvSplit( dft_C, image_ReC, image_ImC, 0, 0 );

// Compute A^2 + B^2 of denominator or blur kernel
image_ReB = cvCreateImage( cvSize(dft_N1, dft_M1), IPL_DEPTH_64F, 1);
image_ImB = cvCreateImage( cvSize(dft_N1, dft_M1), IPL_DEPTH_64F, 1);

// Split Real and imaginary parts
cvSplit( dft_B, image_ReB, image_ImB, 0, 0 );
cvPow( image_ReB, image_ReB, 2.0);
cvPow( image_ImB, image_ImB, 2.0);
val = cvScalarAll(kappa);

//Divide Numerator/A^2 + B^2
cvDiv(image_ReC, image_ReB, image_ReC, 1.0);
cvDiv(image_ImC, image_ReB, image_ImC, 1.0);

// Merge Real and complex parts
cvMerge(image_ReC, image_ImC, NULL, NULL, complex_ImC);

// Perform Inverse
cvShowInvDFT1(im, (CvMat *)complex_ImC,dft_M1,dft_N1,str);

cvWaitKey(-1);
return 0;
}

CvMat* cvShowDFT1(IplImage* im, int dft_M, int dft_N,char* src)
{
IplImage* realInput;
IplImage* imaginaryInput;
IplImage* complexInput;
CvMat* dft_A, tmp;
IplImage* image_Re;
IplImage* image_Im;
char str[80];
double m, M;
realInput = cvCreateImage( cvGetSize(im), IPL_DEPTH_64F, 1);
imaginaryInput = cvCreateImage( cvGetSize(im), IPL_DEPTH_64F, 1);
complexInput = cvCreateImage( cvGetSize(im), IPL_DEPTH_64F, 2);
cvScale(im, realInput, 1.0, 0.0);
cvZero(imaginaryInput);
cvMerge(realInput, imaginaryInput, NULL, NULL, complexInput);

dft_A = cvCreateMat( dft_M, dft_N, CV_64FC2 );
image_Re = cvCreateImage( cvSize(dft_N, dft_M), IPL_DEPTH_64F, 1);
image_Im = cvCreateImage( cvSize(dft_N, dft_M), IPL_DEPTH_64F, 1);

// copy A to dft_A and pad dft_A with zeros
cvGetSubRect( dft_A, &tmp, cvRect(0,0, im->width, im->height));
cvCopy( complexInput, &tmp, NULL );
if( dft_A->cols > im->width )
{
cvGetSubRect( dft_A, &tmp, cvRect(im->width,0, dft_A->cols – im->width, im->height));
cvZero( &tmp );
}

// no need to pad bottom part of dft_A with zeros because of
// use nonzero_rows parameter in cvDFT() call below
cvDFT( dft_A, dft_A, CV_DXT_FORWARD, complexInput->height );
strcpy(str,”DFT -“);
strcat(str,src);
cvNamedWindow(str, 0);

// Split Fourier in real and imaginary parts
cvSplit( dft_A, image_Re, image_Im, 0, 0 );

// Compute the magnitude of the spectrum Mag = sqrt(Re^2 + Im^2)
cvPow( image_Re, image_Re, 2.0);
cvPow( image_Im, image_Im, 2.0);
cvPow( image_Re, image_Re, 0.5 );

// Compute log(1 + Mag)
cvAddS( image_Re, cvScalarAll(1.0), image_Re, NULL ); // 1 + Mag
cvLog( image_Re, image_Re ); // log(1 + Mag)

cvMinMaxLoc(image_Re, &m, &M, NULL, NULL, NULL);
cvScale(image_Re, image_Re, 1.0/(M-m), 1.0*(-m)/(M-m));
cvShowImage(str, image_Re);
return(dft_A);
}

void cvShowInvDFT1(IplImage* im, CvMat* dft_A, int dft_M, int dft_N,char* src)
{
IplImage* realInput;
IplImage* imaginaryInput;
IplImage* complexInput;
IplImage * image_Re;
IplImage * image_Im;
double m, M;
char str[80];
realInput = cvCreateImage( cvGetSize(im), IPL_DEPTH_64F, 1);
imaginaryInput = cvCreateImage( cvGetSize(im), IPL_DEPTH_64F, 1);
complexInput = cvCreateImage( cvGetSize(im), IPL_DEPTH_64F, 2);
image_Re = cvCreateImage( cvSize(dft_N, dft_M), IPL_DEPTH_64F, 1);
image_Im = cvCreateImage( cvSize(dft_N, dft_M), IPL_DEPTH_64F, 1);

//cvDFT( dft_A, dft_A, CV_DXT_INV_SCALE, complexInput->height );
cvDFT( dft_A, dft_A, CV_DXT_INV_SCALE, dft_M);
strcpy(str,”DFT INVERSE – “);
strcat(str,src);
cvNamedWindow(str, 0);

// Split Fourier in real and imaginary parts
cvSplit( dft_A, image_Re, image_Im, 0, 0 );

// Compute the magnitude of the spectrum Mag = sqrt(Re^2 + Im^2)
cvPow( image_Re, image_Re, 2.0);
cvPow( image_Im, image_Im, 2.0);
cvPow( image_Re, image_Re, 0.5 );

// Compute log(1 + Mag)
cvAddS( image_Re, cvScalarAll(1.0), image_Re, NULL ); // 1 + Mag
cvLog( image_Re, image_Re ); // log(1 + Mag)

cvMinMaxLoc(image_Re, &m, &M, NULL, NULL, NULL);
cvScale(image_Re, image_Re, 1.0/(M-m), 1.0*(-m)/(M-m));
//cvCvtColor(image_Re, image, CV_GRAY2RGBA);
cvShowImage(str, image_Re);
}
– De-blurring revisited with Wiener filter using OpenCV
–  Dabbling with Wiener filter using OpenCV
– Deblurring with OpenCV: Wiener filter reloaded
– Re-working the Lucy-Richardson Algorithm in OpenCV

Computer Vision: Getting started with OpenCV

OpenCV (Open Source Computer Vision) is a library of APIs aimed at enabling real time computer vision. OpenCV had Intel as  the champion during its early developmental stages from 1999 to its first formal release in 2006. This post looks at the steps    involved in installing OpenCV on your Linux machine and getting started with some simple programs. For the steps for installing OpenCV in Windows look at my post “Installing and using OpenCV with Visual Studio 2010 express

$HOME/opencv Unzip and and untar the bzipped file using$ tar jxvf OpenCV-2.3.1a.tar.bz2

Now

$cd OpenCV-2.3.1 You have to run cmake to configure the directories before running make. I personally found it easier to run the cmake wizard using$ cmake -i

Follow through all the prompts that the cmake wizard gives and make appropriate choices.

Once this complete

Run

$make Login as root$ su – root

Run

$make install This should install all the appropriate files and libraries in /usr/local/lib Now assuming that everything is fine you should be good to go. Start Eclipse, open a new C project. Under Project->Properties->Settings->GCC Compiler ->Directories include the following 2 include paths ../opencv/OpenCV-2.3.1/include/opencv ../opencv/OpenCV-2.3.1/include/opencv2 Under Project->Properties->Settings->GCC Linker ->Libraries in the library search path include /usr/local/lib Under libraries include the following opencv_highgui , opencv_core , opencv_imgproc , opencv_highgui, opencv_ml, opencv_video, opencv_features2d, opencv_calib3d , opencv_objdetect, opencv_contrib, opencv_legacy, opencv_flann (To get the list of libraries you could also run the following command) pkg-config –libs /usr/local/lib/pkgconfig/opencv.pc -L/usr/local/lib -lopencv_core -lopencv_imgproc -lopencv_highgui -lopencv_ml -lopencv_video -lopencv_features2d -lopencv_calib3d -lopencv_objdetect -lopencv_contrib -lopencv_legacy -lopencv_flann Now you are ready to create your first OpenCV program The one below will convert an image to test.png #include “highgui.h” int main( int argc, char** argv ) { IplImage* img = cvLoadImage( argv[1],1); cvSaveImage( “test.png”, img, 0); cvReleaseImage( &img ); return 0; } If you get a runtime error cannot find shared library “ibopencv_core.so.2.3: cannot open shared object file: No such file or directory” then you need to ensure that the linker knows the paths of the libraries. The commands are as follows$vi /etc/ld.so.conf.d/opencv.conf

Enter

/usr/local/lib

and save file

Now execute

$ldconfig /etc/ld.so.conf You can check if everything is fine by running$[root@localhost mycode]# ldconfig -v | grep open

ldconfig: /etc/ld.so.conf.d/kernel-2.6.32.26-175.fc12.i686.PAE.conf:6: duplicate hwcap 0 nosegneg

libopencv_gpu.so.2.3 -> libopencv_gpu.so.2.3.1

libopencv_ml.so.2.3 -> libopencv_ml.so.2.3.1

libopencv_legacy.so.2.3 -> libopencv_legacy.so.2.3.1

libopencv_objdetect.so.2.3 -> libopencv_objdetect.so.2.3.1

libopencv_video.so.2.3 -> libopencv_video.so.2.3.1

…..

Now re-build the code and everything should be fine.

Here’s a second program to run various transformations to an image

// create a window. Window name is determined by a supplied argument

cvNamedWindow( argv[1], CV_WINDOW_AUTOSIZE );

// Apply Gaussian smooth

//cvSmooth( img, img, CV_GAUSSIAN, 9, 9, 0, 0 );

cvErode (img,img,NULL,2);

// Display an image inside and window.

cvShowImage( argv[1], img );

//Save image

cvSaveImage( “/home/ganesh/Desktop/baby2.png“, img, 0);

….

There are many samples also downloaded along with the installation. You can try them out. I found the facedetect.cpp sample interesting. It is based on Haar cacades and works really well. Compile facedetect.cpp under samples/c

Check it out. Including facedetect.cpp detecting my face in real time …

Have fun with OpenCV.

Get going! Get hooked!