Thursday, October 9, 2008

Activity 18: Pattern Recognition

Pattern recognition falls under the domain of machine learning. It has many applications in medicine, gait analysis, speech recognition, and text classification. The technique proceeds by feature extraction followed by classification of the gathered features. The test set is then compared against the features obtained from the training set.

In this activity, we use objects belonging to different categories, whose features are to be extracted from the images. Below are samples of the object images used for the pattern recognition.

Pillow


Piatos


Fishball


Kwek kwek

For each image, I used the R, G, B values, shape, and area as classification features. Each feature is normalized so that the bias toward very large or very small values is removed. Four samples of each class are used as the training set and the other four as the test set.
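As an illustration, here is a minimal Scilab sketch of the normalization step (the matrix name feat and the max-scaling are assumptions, since the exact normalization is not written out here):

// hypothetical sketch: scale each feature column to [0,1]
// feat is assumed to hold one row per sample and one column per feature
for k = 1:size(feat, 2)
    feat(:,k) = feat(:,k) / max(feat(:,k));   // removes the bias toward large-valued features
end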

The table below shows the recognition success rate for each class based on the chosen features.


             Pillow    Piatos    Fishball    Kwek kwek
Pillow        4/4       1/4        0/4         0/4
Piatos        1/4       3/4        0/4         0/4
Fishball      0/4       1/4        3/4         2/4
Kwek kwek     0/4       1/4        1/4         4/4

(Correct classifications lie along the diagonal.)


Based on these features, the overall recognition accuracy is 87.5% (14 of the 16 test samples were correctly classified). To increase the accuracy, more features should be added for classification.
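For reference, here is a minimal sketch of a classification step consistent with the description above, assuming a minimum-distance classifier (the classifier type and the variable names are assumptions):

// hypothetical minimum-distance classifier sketch
// mu(k,:) holds the mean feature vector of class k over its four training
// samples; xt is the normalized feature vector of one test sample
nclass = 4;
d = zeros(nclass, 1);
for k = 1:nclass
    d(k) = norm(xt - mu(k,:));   // Euclidean distance to each class mean
end
[dmin, class] = min(d);          // assign the sample to the nearest class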


Rating: I will give myself 9.0 points. I was not able to achieve 100% recognition accuracy due to the limited number of features I can extract with the image processing techniques that I know.

Tuesday, September 9, 2008

Activity 17: Basic Video Processing

For this activity, we applied basic video and image processing techniques to a diffusive system. We used a drop of very concentrated black ink and allowed it to diffuse in warm water (warm for a faster diffusion rate). We used a transparent beaker and a white background for easier thresholding of the resulting images, which we convert to binary. The water level is kept low to assume 2D diffusion. The threshold used is the one for the image where the system has a uniformly distributed concentration. Pixel values above this threshold are included in the pixel count, which corresponds to the total concentration at that time. The area of the diffusing ink is expected to increase (Ref: S. Lee et al., "Ink diffusion in water", EJP).

In our experiment, we obtained the same behavior of the area as a function of time, which can be fit with the right fitting parameters.



The ROI is the diffusing ink in the middle. Some boundary pixels reached or surpassed the threshold, which adds error to the area calculation. This was minimized by labeling the blobs and counting only the middle blob.

Rating: Based on the existing literature, I was able to reproduce the dynamics of ink diffusing in water. Also, I was able to familiarize myself with some basic video processing techniques and integrate the previous area calculation method. Therefore, I'll give myself 10.0 points.

Code:
t=110/255;                     // threshold taken from the uniformly diffused frame
c=[];                          // pixel count (ink area) per frame
// frames 1-9 are zero-padded differently from frames 10-99, hence the two loops
for i=1:9
    im=imread('C:\Documents and Settings\gpedemonte\Desktop\ap186_a17\vid000'+string(i)+'.png');
    imb=im2bw(im, t);          // binarize the frame
    imwrite(imb,'C:\Documents and Settings\gpedemonte\Desktop\recon\vid000'+string(i)+'.png');
    c(i)=sum(abs(1-imb));      // count the dark (ink) pixels
end
for i=10:99
    im=imread('C:\Documents and Settings\gpedemonte\Desktop\ap186_a17\vid00'+string(i)+'.jpg');
    imb=im2bw(im, t);
    imwrite(imb,'C:\Documents and Settings\gpedemonte\Desktop\recon\vid00'+string(i)+'.png');
    c(i)=sum(abs(1-imb));
end
plot(c);                       // ink area as a function of frame number (time)
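The blob labeling mentioned above is not part of the posted code; here is a minimal sketch of how it could be inserted inside the loop (the center-pixel assumption is mine):

// hypothetical sketch: count only the middle blob instead of all dark pixels
[ny, nx] = size(imb);
L = bwlabel(1 - imb);                 // label the connected blobs in the inverted binary frame
k = L(round(ny/2), round(nx/2));      // assumes the diffusing ink covers the image center
c(i) = length(find(L == k));          // area of the center blob only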

Tuesday, September 2, 2008

Activity 16: Image Segmentation

Similar to previous activities, this time we have to segment a region of interest in a colored image using its RG chromaticity histogram: first with a parametric (Gaussian) probability estimate, then with histogram backprojection.

Figure 1 Sample image used

Figure 2 Cropped image

Figure 3 Reconstructed from the cropped image histogram

Figure 4 RG Histogram

Figure 5 Reconstructed from back projection


Rating: I'll give myself 9.8 points. I used Jeric's code, but the algorithm is the same as in mine; I just find his code better to use.

Code:
// ROI: whole image to segment; ROI_sub: cropped patch of the target region
ROI=imread('C:\Documents and Settings\gpedemonte\Desktop\ap187-2\activity 3\sample1.jpg');
// convert to normalized chromaticity coordinates (r=R/I, g=G/I, b=B/I)
I=ROI(:,:,1)+ROI(:,:,2)+ROI(:,:,3);
I(find(I==0))=100;             // avoid division by zero in dark pixels
ROI(:,:,1)=ROI(:,:,1)./I;
ROI(:,:,2)=ROI(:,:,2)./I;
ROI(:,:,3)=ROI(:,:,3)./I;

ROI_sub=imread('C:\Documents and Settings\gpedemonte\Desktop\ap187-2\activity 3\sample2.jpg');
I=ROI_sub(:,:,1)+ROI_sub(:,:,2)+ROI_sub(:,:,3);
ROI_sub(:,:,1)=ROI_sub(:,:,1)./I;
ROI_sub(:,:,2)=ROI_sub(:,:,2)./I;
ROI_sub(:,:,3)=ROI_sub(:,:,3)./I;

//probability estimation: model the patch r and g as independent Gaussians
mu_r=mean(ROI_sub(:,:,1)); st_r=stdev(ROI_sub(:,:,1));
mu_g=mean(ROI_sub(:,:,2)); st_g=stdev(ROI_sub(:,:,2));

Pr=1.0*exp(-((ROI(:,:,1)-mu_r).^2)/(2*st_r^2))/(st_r*sqrt(2*%pi));
Pg=1.0*exp(-((ROI(:,:,2)-mu_g).^2)/(2*st_g^2))/(st_g*sqrt(2*%pi));
P=Pr.*Pg;                      // joint probability that a pixel belongs to the ROI
P=P/max(P);
scf(1);
x=[-1:0.01:1];
Pr=1.0*exp(-((x-mu_r).^2)/(2*st_r^2))/(st_r*sqrt(2*%pi));
Pg=1.0*exp(-((x-mu_g).^2)/(2*st_g^2))/(st_g*sqrt(2*%pi));
plot(x,Pr, 'r-', x, Pg, 'g-');
scf(2);
roi=im2gray(ROI);
subplot(211);
imshow(roi, []);
subplot(212);
imshow(P,[]);                  // parametric segmentation result

//---------- histogram backprojection ----------
//histogram: 32x32 bins over the (r,g) chromaticity plane
r=linspace(0,1, 32);
g=linspace(0,1, 32);
prob=zeros(32, 32);
[x,y]=size(ROI_sub);
for i=1:x
    for j=1:y
        xr=find(r<=ROI_sub(i,j,1));
        xg=find(g<=ROI_sub(i,j,2));
        prob(xr(length(xr)), xg(length(xg)))=prob(xr(length(xr)), xg(length(xg)))+1;
    end
end
prob=prob/sum(prob);           // normalize to a probability distribution
imshow(prob,[]); xset('colormap', hotcolormap(256));
scf(3)
surf(prob);

//backprojection: replace each pixel by its histogram probability
[x,y]=size(ROI);
rec=zeros(x,y);
for i=1:x
    for j=1:y
        xr=find(r<=ROI(i,j,1));
        xg=find(g<=ROI(i,j,2));
        rec(i,j)=prob(xr(length(xr)), xg(length(xg)));
    end
end
scf(4);
imshow(rec, []);

Thursday, August 28, 2008

Activity 15: White Balancing

Correcting images captured under different lighting conditions such that white appears white is a technique called white balancing. To do this, we capture an object under different lighting conditions and compare the whiteness of the object and of the image.


tungsten

automatic

cloudy

daylight

fluorescent

There is variation in the whiteness of the image across lighting conditions. Two methods can be used to perform white balancing: the Reference White algorithm and the Gray World algorithm. In the Reference White algorithm, we first determine the RGB values of a known white object in the image and use them as the reference: each channel of the unbalanced image is divided by the corresponding reference value. The Gray World algorithm, on the other hand, assumes that the image is gray on average: each channel of the unbalanced image is divided by the average value of that channel.
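A minimal Scilab sketch of the two algorithms (the filename and the white-patch location are assumptions; this is not the exact code used):

im = imread('unbalanced.png');        // assumed filename
wb = im; gw = im;
// Reference White: divide each channel by the RGB of a known white patch
Rw = mean(im(1:50, 1:50, 1));         // assumed white patch in the upper-left corner
Gw = mean(im(1:50, 1:50, 2));
Bw = mean(im(1:50, 1:50, 3));
wb(:,:,1) = im(:,:,1) / Rw;
wb(:,:,2) = im(:,:,2) / Gw;
wb(:,:,3) = im(:,:,3) / Bw;
// Gray World: divide each channel by its own average instead
gw(:,:,1) = im(:,:,1) / mean(im(:,:,1));
gw(:,:,2) = im(:,:,2) / mean(im(:,:,2));
gw(:,:,3) = im(:,:,3) / mean(im(:,:,3));
// values above 1 may need clipping before display
scf(1); imshow(wb);
scf(2); imshow(gw);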


Rating: I'll give myself 9.0 points, since I was able to do the activity with a good degree of accuracy. Thanks to Mark and Jeric for the sample images.

Monday, August 25, 2008

Activity 14: Stereometry

In this activity, we wish to recover depth information and reconstruct the 3D shape of an object via stereometry. This method requires multiple images of the object (at least two), where either the camera position is varied while the object stays fixed, or the object is moved while the camera stays fixed. It should be noted that the image plane, as well as the object plane, is the same between shots.

Here are the images of a Rubik's cube which I used as a sample. I took the points between the grid lines of the cube's upper face, because it is the only part of the image with an expected variation in depth.

The values used are f = 14 mm and b = 10 mm. For my case, the resulting image is just the upper face, inverted due to the arrangement of points in the matrix used and the data plotting order. The deeper part is the upper portion of the face.
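For reference, the depth at each point is recovered from the disparity through the relation used in the code below:

z = b*f / (x1 - x2)

where x1 and x2 are the positions of the same point in the two images, b is the camera displacement, and f is the focal length.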

Rating: I'm convinced that I was able to retrieve the depth information based on the results, so I'll give myself 9.5 points. Manual point location perhaps gave rise to errors, which resulted in non-uniformities of the points in the reconstruction.

The Code:
// x1, x2: pixel coordinates of the same grid points in the two images
x1=[183 214 248 283 320 352 385; 182 214 248 283 320 352 386; 181 213 248 283 321 354 387; 180 212 247 283 321 355 389; 179 211 247 283 322 356 391; 178 210 247 283 323 358 393; 176 209 247 283 323 358 395];
x2=[144 174 209 243 280 313 346; 143 173 209 243 280 313 347; 141 172 209 244 280 314 349; 140 171 208 244 281 315 350; 137 169 208 244 281 316 352; 135 167 207 244 282 317 353; 133 165 206 243 281 318 353];
f=14;                           // focal length in mm
b=10;                           // camera displacement in mm
z=(b*f)./(x1-x2);               // depth from disparity
n = 7;                          // a regular n x n grid of interpolation points
x = linspace(0,1,n); y = x;
C = splin2d(x, y, z, "periodic");   // 2D spline through the recovered depths
m = 10;                         // discretisation parameter of the evaluation grid
xx = linspace(0,1,m); yy = xx;
[XX,YY] = ndgrid(xx,yy);
zz = interp2d(XX,YY, x, y, C);  // evaluate the spline on the finer grid
xbasc()
plot3d(xx, yy, zz);             // reconstructed surface
[X,Y] = ndgrid(x,y);
param3d1(X,Y,list(z,-9*ones(1,n)), flag=[0 0])  // overlay the raw depth points

Monday, August 11, 2008

Activity 13: Photometric Stereo

Using the photometric stereo technique, we reconstruct the shape of a 3D object by capturing images of the object at different positions of the light source.

We have to compute the matrix of reflectance g using the equation

g = (V'V)^(-1) V' I

where V is the matrix of the light source positions and I is the image matrix.
Plotting the value of the elevations given by the equations

n = g / |g|
df/dx = -n_x / n_z,  df/dy = -n_y / n_z
z(x,y) = ∫ (df/dx) dx' + ∫ (df/dy) dy'   (computed as cumulative sums in the code)

the result is a hemisphere.


Rating: I'll give myself 10 points, since the reconstructed shape is similar to the original object.


The Code:
loadmatfile('Photos.mat');      // loads the four images I1..I4

// light source positions (one row per source)
V(1,:) = [0.085832 0.17365 0.98106];
V(2,:) = [0.085832 -0.17365 0.98106];
V(3,:) = [0.17365 0 0.98481];
V(4,:) = [0.16318 -0.34202 0.92542];

// stack each image as a row of the intensity matrix
I(1,:) = I1(:)';
I(2,:) = I2(:)';
I(3,:) = I3(:)';
I(4,:) = I4(:)';

g = inv(V'*V)*V'*I;             // least-squares estimate of the reflectance vectors
ag = sqrt((g(1,:).*g(1,:))+(g(2,:).*g(2,:))+(g(3,:).*g(3,:)))+1e-6;

// normalize to get the unit surface normals
for i = 1:3
    n(i,:) = g(i,:)./(ag);
end

//get the surface partial derivatives
dfx = -n(1,:)./(n(3,:)+1e-6);
dfy = -n(2,:)./(n(3,:)+1e-6);

//get estimate of the line integrals
int1 = cumsum(matrix(dfx,128,128),2);
int2 = cumsum(matrix(dfy,128,128),1);
z = int1+int2;                  // surface elevation
plot3d(1:128, 1:128, z);

Wednesday, July 30, 2008

Activity 11: Camera Calibration

The objective of this activity is to map the coordinates of an object to the coordinates of the camera, and finally to the image, via some transformations.


Figure 1 Mapping of the object to the camera and to the image

Solving the relations from the figure above, we end up with the equations below. The subscript "i" refers to the image and "o" refers to the object. The final equations are only 2D since the image lies in a single plane.

y_i = (a11*x_o + a12*y_o + a13*z_o + a14) / (a31*x_o + a32*y_o + a33*z_o + 1)
z_i = (a21*x_o + a22*y_o + a23*z_o + a24) / (a31*x_o + a32*y_o + a33*z_o + 1)

The equations above can be rearranged into linear matrix form, with each calibration point contributing two rows.

We append rows to the Q matrix and the d vector as we increase the number of points used to determine the mapping.

Qa = d


Solving for the transformation matrix a by least squares,

a = (Q'Q)^(-1) Q' d


The image used is a picture of a checkerboard.


Figure 2 Image of a checkerboard

The new pixel locations are the expected output of the program given an object location. I tested other points, and the results are consistent, with small errors. Errors may arise due to imperfections in the real camera transformation (like the radial distortion mentioned in the lecture).

Collaborators:

Julie for the image and Jeric for the code

Rating: I'll give myself a 7.0 since it took me long before I figured out how to transform the object coordinates to the camera and to the image. I had to consult and ask for the help of my collaborators.


The Code:

// object coordinates (in grid units) and the corresponding image pixels
x=[0 0 0 0 1 1 2 0 1 0 3 1 0 3 0 2 0 0 0 1 ];
y=[0 1 0 3 0 0 0 1 0 5 0 0 2 0 2 0 5 3 0 0 ];
z=[0 12 3 2 1 2 5 8 1 3 8 9 6 10 7 7 9 10 9];
yi=[127 139 126 164 115 115 102 139 113 191 90 114 152 88 152 101 193 166 125 112];
zi=[200 187 169 161 172 189 175 123 73 200 161 73 57 112 40 91 97 58 138 56];
obj=[];
im=[];
// build the Q matrix (obj) and d vector (im): two rows per calibration point
for i=0:length(x)-1
    obj(2*i+1, :)=[x(i+1) y(i+1) z(i+1) 1 0 0 0 0 -yi(i+1).*x(i+1) -yi(i+1).*y(i+1) -yi(i+1).*z(i+1)];
    obj(2*i+2, :)=[0 0 0 0 x(i+1) y(i+1) z(i+1) 1 -zi(i+1).*x(i+1) -zi(i+1).*y(i+1) -zi(i+1).*z(i+1)];
    im(2*i+1)=yi(i+1);
    im(2*i+2)=zi(i+1);
end
a=inv(obj'*obj)*obj'*im;        // least-squares solution for the 11 unknowns
a(12)=1.0;                      // fix the overall scale (a34 = 1)
a=matrix(a, 4, 3)';             // reshape into the 3x4 transformation matrix
// predict the image location of a test object point
testx=1;
testy=1;
testz=1;
ty=(a(1,1)*testx+a(1,2)*testy+a(1,3)*testz+a(1,4))/(a(3,1)*testx+a(3,2)*testy+a(3,3)*testz+a(3,4));
tz=(a(2,1)*testx+a(2,2)*testy+a(2,3)*testz+a(2,4))/(a(3,1)*testx+a(3,2)*testy+a(3,3)*testz+a(3,4));
ty
tz

Tuesday, July 22, 2008

A10: Preprocessing Handwritten Text

First, we sample from the larger image so that the details are focused on the text. By doing so, we can remove unwanted information by blocking its frequencies. The figure below is the original sampled image.

Figure 1 Sample cropped image

The filter is designed based on the information obtained from the FFT of the image above.

Figure 2 FFT of the sample image

The maxima along the vertical axis should be blocked to remove the horizontal lines in the original image, but the center (the DC term) should be excluded. The figure below is the filter designed for this particular sample.

Figure 3 Designed filter
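The filter bitmap was made by hand in an image editor; here is a minimal sketch of building an equivalent mask programmatically (the image size, strip width, and center radius are assumptions):

// hypothetical sketch: block the vertical maxima but keep the center
nx = 512; ny = 512;                       // assumed image dimensions
mask = ones(nx, ny);                      // pass all frequencies by default
w = 4;                                    // assumed half-width of the blocked strip
mask(:, ny/2-w:ny/2+w) = 0;               // block the vertical line of maxima
r = 10;                                   // assumed radius of the preserved center
mask(nx/2-r:nx/2+r, ny/2-r:ny/2+r) = 1;   // keep the region around the DC term
// the mask is drawn centered, matching the fftshift applied to the filter below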

After filtering the image by multiplication in Fourier space (equivalent to convolution in image space), the inverse FFT becomes the new image to be binarized. The threshold was predetermined using GIMP.

Figure 4 Inverse FFT of the filtered image

The image above is binarized, and using some erosion and dilation, the unwanted geometries are removed. The resulting binary (thresholded) image, reconstructed image, and labeled image are shown below.

a) binary image

b) reconstructed image

c) labeled image


Rating: Based on the reconstructions I've done, no significant enhancement is observed, but the results deviate only slightly from the original image. I am confident that I did well in the activity except with regard to the readability, so I'll give myself 10.0 points.

Code:
im=imread('C:\Documents and Settings\gpedemonte\Desktop\text.bmp');      // cropped text image
imf=imread('C:\Documents and Settings\gpedemonte\Desktop\filter4.bmp');  // hand-drawn filter mask

ft1=fft2(im);
ft2=fftshift(imf);              // align the centered mask with the unshifted FFT
imft=ft1.*ft2;                  // apply the filter in Fourier space

im3=real(fft2(imft));           // second forward FFT in place of the inverse FFT
scf(1);
imshow(im3,[]);
im4=im2bw(im3, 115/255);        // binarize with the threshold found in GIMP
scf(2);
imshow(im4,[]);

m=[1 1];                        // structuring element for the cleanup
im5=erode(im4,m);
scf(3);
imshow(im5,[]);

im6=bwlabel(im5);               // label the connected components
scf(4);
imshow(im6,[]);

Wednesday, July 16, 2008

A9: Binary Operations

Given an image of circles (see Figure 1), we have to measure the area of a single circle by sampling different regions of the image. The images are converted to binary using thresholds that I determined manually in ImageJ.

Figure 1 Image where the samples are derived

Some circles are not perfect, and others are cut during sampling; hence, erroneous area values will occur upon computation. To resolve this problem, we have to dilate and erode the circles to approximately restore their original shape. I used an opening operator (erosion followed by dilation): the image is eroded first to remove very small blobs and then dilated to cancel the erosion.

The area is estimated by determining the area of a single circle. The method is statistical, since there are numerous samples. For each sampled image, each blob is labeled, and the number of pixels in a blob is the area of that circle. This is done for the rest of the samples, and a histogram of the areas is plotted.

Figure 2 Histogram of computed areas

The average area obtained from the samples is 540 pixels with a standard deviation of 117 pixels. The histogram peaks at 540, which means the mean and the mode coincide, enough to say that the result is valid. Another way to verify the result is to determine the same area by cropping a single, almost perfect circle; finding the average area and standard deviation from several such samples gives a better approximation. In my case, the resulting average area is 535.0 pixels with a standard deviation of 3 pixels.
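A minimal sketch of the statistics step, assuming the per-blob areas from all the samples have been collected into a single vector areas:

// hypothetical sketch: statistics over the collected blob areas
m  = mean(areas);             // average blob area (540 pixels above)
sd = stdev(areas);            // standard deviation (117 pixels above)
scf(); histplot(20, areas);   // histogram of the areas, 20 bins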


Rating: The activity is quite easy (except for the automation), so I'll give myself 10 points.

Code: just change the corresponding threshold for each image (see the commented thresholds below)

im=imread('G:\AP 186\a9\c1.jpg');
//converting the image to binary with its corresponding threshold
im1=im2bw(im,0.81);
subplot(121);
imshow(im1, []);
//thresholds for the other sampled images:
//im2=im2bw(im,0.85);
//im3=im2bw(im,0.78);
//im4=im2bw(im, 0.81);
//im5=im2bw(im, 0.82);
//im6=im2bw(im, 0.83);
//im7=im2bw(im, 0.83);
//im8=im2bw(im, 0.82);
//im9=im2bw(im, 0.78);
//im10=im2bw(im, 0.82);
//im11=im2bw(im, 0.80);
//im12=im2bw(im, 0.82);
//im13=im2bw(im, 0.76);
//im14=im2bw(im, 0.77);
//im15=im2bw(im, 0.81);
imn1=dilate(erode(im1));        // opening: erosion followed by dilation
subplot(122);
imshow(imn1, []);
L1=bwlabel(imn1);               // label each blob
area=zeros(max(L1),1);
//area of each blob by pixel counting
for i=1:max(L1)
    [x,y]=find(L1==i);
    area(i)=length(y);
end
area

Monday, July 14, 2008

A8: Morphological Operations



Answers to questions:

1. A XOR B corresponds to the union of A and B minus their intersection (the symmetric difference)
2. NOT(A) AND B corresponds to the intersection of the complement of A with B

Dilation and Erosion

The figures above (a square, a triangle, a circle, and a cross) are the images to be eroded and dilated with a 4 by 4 square, a 4 by 2 rectangle, a 2 by 4 rectangle, and a cross 5 pixels long and 1 pixel thick.

The results for erosion are the following images corresponding to their original image.

Eroded image of a square

Eroded image of a triangle

Eroded image of a circle

Eroded image of a cross


The eroded images not only followed the shape of the structuring patterns but were also reduced in size. In my predictions, I only accounted for the decrease in size.


The results for dilation are the following images corresponding to their original image.

Dilated image of a square

Dilated image of a triangle

Dilated image of a circle

Dilated image of a cross


The dilated images, as predicted, increased in size, but again, the shape that the pattern imposes on the image was not accounted for in my predictions.

Rating: I'll give myself 8 points. I did this activity alone, and I missed some of the predicted outputs.


Code:

erosion and dilation (note: just change dilate to erode and vice versa):

im=imread('C:\Documents and Settings\Instru\Desktop\a8_cross.bmp');
// structuring elements: 4x4 square, two rectangles, and a 5-pixel cross
a1=[1 1 1 1; 1 1 1 1; 1 1 1 1; 1 1 1 1];
a2=[1 1 1 1; 1 1 1 1];
a3=[1 1; 1 1; 1 1; 1 1];
a4=[0 0 1 0 0; 0 0 1 0 0; 1 1 1 1 1; 0 0 1 0 0; 0 0 1 0 0];
// dilate the image with each structuring element and show the results
im1=dilate(im,a1);
im2=dilate(im,a2);
im3=dilate(im,a3);
im4=dilate(im,a4);
subplot(221);
imshow(im1,[]);
subplot(222);
imshow(im2,[]);
subplot(223);
imshow(im3,[]);
subplot(224);
imshow(im4,[]);