Friday, April 27. 2007
I still have some difficulties understanding sampling lattices in the 4D position-frequency space of images. Here’s a flow of thoughts on this:
A 1D-signal of length N can be interpreted as an element of CN. (This was always confusing me: an N-dimensional vector is only a 1D-signal!) But in TFA it is actually considered as an infinite periodic vector with period N. Therefore the index set is not just {1,...,N} but rather ZN:=Z/NZ. This set is a finite group under addition modulo N. The TF-domain of that 1D-signal space is ZN×ZN and therefore 2D; it still has the group structure (by components). A sampling subset Λ of this TF-plane takes out certain time-locations and frequencies. If one has a fixed window function g with same signal length, the Gabor family with regard to that sampling subset is the set of those shifts and modulations of the Gabor atom g where the time-locations and frequencies are given by Λ. If Λ has enough elements, namely |Λ|>N, then...??? No, the redundancy doesn’t tell anything about whether the Gabor family is a frame! It is automatically a frame (in the finite setting) as soon as it spans the whole signal space. So what does that redundancy tell us? And why is it automatically large enough as soon as the Gabor family is a frame? Or what? Another open question is why or when one should have a subgroup as sampling subset. What advances does one get when the sampling is done on a subgroup? If Λ is a subgroup, then there is a dual? Is there no dual when Λ is no subgroup? Or does it just depend on the redundancy then? It might have something to do with the fact that the Gabor frame operator commutes with TF-shifts along that subgroup. The things get even trickier for 2D-signals. An image of size p×q is also considered as an infinite periodic signal with periods p in one direction and q in the other. The signal space has N=pq dimensions. The index set is Zp×Zq. The TF-domain (actually position-frequency domain) becomes (Zp×Zq)×(Zp×Zq) and is therefore 4D. What’s unclear here is the term of separability with regard to the sampling subset. In the 1D-case (TF-plane is 2D) one names the sampling subset a separable lattice when the shifts and modulations are defined by the multiples of a divisor of N. I.e., one gets a rectangular grid (lattice) in the TF-plane in this case: For every time-shift there is the same set of modulations. A non-separable case could be given by a set of random sampling points. But these don’t form a subgroup in general. A non-separable subgroup could look like a rotated version of a separable lattice. For a rotation by 45° (π/4) one gets the special case of a quincunx lattice (if the correct distances are taken). And now to separabililty of a 4D-lattice: What does a 4D quincunx look like? Or another 4D non-separable lattice? Does this mean that whenever I can split into two 2D-lattices, one talks about separable lattices, independently from the question whether these two are separable again? I think separable means here that one either has a position-lattice ((x1,x2)-lattice) and a frequency lattice ((ω1,ω2)-lattice) or an (x1,ω1)- and an (x2,ω2)-lattice, independently from whether these are separable themselves or not. But how does one construct a 4D-set out of two 2D-sets with MATLAB/Octave?
Continue reading "Sampling lattices as subgroups"
Thursday, April 12. 2007
Given an image im of size p× q with gcd( p, q)=1. - Find a number s that is not divided by the divisors of N=pq.
- Index vector, method 1:
- idx = mod(1:s:N*s, N);
- nil = find(idx==0); idx(nil)=N;
- Index vector, method 2 (preferred):
- idx = mod((1:s:N*s)-1, N)+1;
- imv(idx) = im(:);
Reconstruction from vector of length N= pq: - rim = reshape(imv(idx), p, q);
Let’s try this with our zebra. In my previous tries I simply took p=479 and q=480 to have gcd(479,480)=1, but 479 is a prime number and I therefore have no possibility to divide it by some step size. For GA I have to lay some kind of lattice over the image. So there is actually a second condition forbidding p and q to be prime. I take p=473 instead, it has 11 and 43 as divisors.
> im=img(4:476,:);
> [p q]=size(im), N=p*q
p = 473
q = 480
N = 227040
> alldiv(N)(1:13)
ans =
2 3 4 5 6 8 10 11 12 15 16 20 22 As one can see, N has many divisors, but e.g. 19 is not one of them. We can’t take e.g. 9 although it doesn’t occur either, as it is divided by 3, which is a divisor of N. Taking such a wrong step size will lead one back to the first pixel too early and one can never span the whole image. > s=9; idx=mod((1:s:N*s)-1, N)+1;
> find(idx==1)
ans =
1 75681 151361
> s=19; idx=mod((1:s:N*s)-1, N)+1;
> find(idx==1)
ans = 1
> find(idx==2)
ans = 23900
> imv(idx) = im(:);
> plot(imv) And now the other way round:
> rim=reshape(imv(idx),p,q);
> size(rim)
ans =
473 480
> imagesc(rim) Indeed looks like our zebra!
By the way, another try of comparing the FFT of the vector with the FFT2 of the image shows again that one cannot get the FFT2 by doing an FFT of the vector. This is because the line-patterns in the image might not be preserved under that transform, as neighboring pixels might not become neighbors in the resulting vector.
Wednesday, April 4. 2007
I found a quicker way to compute the index matrix I mentioned:
> tic; [jj kk]=meshgrid(1:p,1:q); toc
Elapsed time is 0.080879 seconds.
> size(jj), size(kk)
ans =
479 480
ans =
479 480
> tic; idx=mod(b*q*jj+a*p*kk,N)’; toc
Elapsed time is 0.129128 seconds.
> idx(p,q)=N; %valid index value
> imv=zeros(1,N);
> tic; for j=1:p; for k=1:q;
> imv(idx(j,k)) = im(j,k);
> end; end; toc
Elapsed time is 6.770042 seconds. So nothing with 58 seconds anymore.
Tuesday, April 3. 2007
Regarding the previous questions I noticed that GA on LCA groups is too general for my concerns and that the study of GA on finite abelian groups is enough.
In a certain paper HGFei published a result that for p×q-images where p and q are relatively prime there is an isomorphism to a vector of length N=pq. The theory considers the groups and where . One explicit mapping from the matrix indices to the corresponding vector indices is given by
where α and β are integers chosen such that
.
My 480×480 zebra-image doesn’t have a height and width that are relatively prime, so I simply drop one pixel of height:
> im=img(2:480,:);
> size(im)
ans =
479 480
> gcd(479,480)
ans = 1
Just for a first test I set α=p and β=q and define a primitive mapping function. (I’ll have to find a quicker transform, as in my tests the switching takes too long. But this could be due to the actually large test image. In addition, I currently only manage to do the backward step by storing the indices in an index matrix.)
> function ii=ma2ve(j,k)
> p=479; q=480; a=p; b=q;
> ii = mod(b*q*j+a*p*k , p*q);
> if ii==0; ii=p*q; end
> endfunction
> ma2ve(1,1)
ans = 1
> ma2ve(1,2)
ans = 229442
> p*q
ans = 229920
I have to return N instead of 0 to have a correct index value. Although the mapping is actually an isomorphism I didn’t know how to go back to the tuple (j,k), so I store the indices in an index matrix while building the image vector:
> idx=zeros(p,q); imv=zeros(1,N);
> tic; for ii=1:p; for jj=1:q;
> idx(ii,jj) = ma2ve(ii,jj);
> imv(idx(ii,jj)) = im(ii,jj);
> end; end; toc
Elapsed time is 58.358711 seconds.
58 seconds!? However, the image looks like this as a vector:
> plot(imv)
Continue reading "Images to vectors and back"
Thursday, March 15. 2007
I finally managed to scale the reconstructed images appropriately such that one can see at what locations certain 2D-frequencies occur. I FT’ed an image of a zebra and “windowed” the FT with a shifted Gaussian. Doing the inverse FT of that cutout yields a convolution of the input image with a modulated Gaussian, corresponding to
.
This is the zebra, a 480×480 pixel sample I clipped from an image I found in the Wikimedia Commons, and its FFT2:
Again, I rotated the image of the FT by 90° to match the orientation of the “jets” with the line patterns in the zebra. Clearly, the vertically oriented frequencies dominate the image.
Now I window the FT-image by placing a Gaussian at the origin. This results in a low-pass filter. The IFT gives a reconstructed image which only contains the lowest frequencies:
The left half shows the (unmodulated) Gaussian with which the original image has been convolved. The right half shows the reconstructed image—the animal has lost all its zebra patterns! This effect is identical to a Gaussian blur.
Continue reading "Convolving a zebra with modulated Gaussians"
Wednesday, March 14. 2007
I found out that the mentioned symmetry on the FFT2-picture occurs because the modulation in the input image only has a real part and no imaginary part. If one looks at fft2(real(Mg)) where Mg is a modulated Gaussian, one really gets two symmetrically shifted Gaussians as output.
And the Gaussian in the center still comes up because the line patterns don’t span completely between -1 (black) and 1 (white).
I also found a bump2d.m in the NuHAG toolbox which incorporates gaussnk.m, so I can trash my gauss2.m-attempt.
Monday, March 12. 2007
The modulated Gaussian I showed previously was just illustratory and did not consider mathematical properties such as FT-invariance. For that, it has to be periodized, normalized and centered at the borders. The MATLAB script gaussnk.m by N. Kaiblinger from the NuHAG toolbox implements this by evaluating the Gauss function on a wider interval and then wrapping it into the desired signal length. I tried to do that for the 2D case, but here I’m not sure about the term “signal length”, it should actually be the number of pixels. I wrote a first gauss2.m which creates a 2D-Gaussian on a square with almost all desired properties. However, the normalization seems to be wrong, as it does not fulfill , but at least it has and as desired. I have to become familiar with the discrete definition, I don’t know how the norming factors come up; maybe they consider the dimension d as a power.
What I was trying to do was doing localized FFT2’s on a test image. I shifted an arbitrarily selected Gaussian like a spotlight aver an image and did FFT2’s of that. However, on white walls the result didn’t show the desired FT-invariance of the Gaussian. So I had to think of the aforementioned normalizing tasks. Then I was confused that shifts of a Gaussian actually result in modulations on the FT-side. But shouldn’t we want to stay the FT all the same while shifting the Gaussian over a white wall in the input image (and having an unchanged Gaussian at every spot)? Yes, but only as long as we don’t want to be able to invert the FT! Here, we’d need the modulations to get back to the corresponding shifts on the input image. So, if only the spectra are interesting, one should look at the absolute values of the FT which will make the modulation factors disappear (because they have an absolute value of 1).
Another thing I came over was this; look at the following example image: On the left half is the input, and the right half shows the absolute values of the FFT2. The input almost looks like a modulated Gaussian. Why doesn’t it simply shift on the output image? Because in the input, the black values of the surroundings match the black values of the line pattern! If the input should represent a modulated Gaussian, the black lines should have value -1, and the surroundings a gray level of 0. I’ll correct this soon.
I like to rotate the output by 90° to match the orientation of the shifts with that of the lines in the input. Notice that I shifted the mid tones in the output image and that I always get another Gaussian in the center. This is the one I have to get rid of. Also, I get some kind of symmetry, the cause of which I haven’t found out yet but might also be due to the scaling issue.
For that, I visualized the TF- and FT-behavior of my 2D-Gaussian:
Continue reading "Time-frequency shifts of a 2D-Gaussian"
|