Friday, May 25. 2007
I’ve finally improved my script that computes a dilated and rotated 2D Gaussian with respect to low memory consumption. It applies a dilation and rotation matrix to the 2D domain. I published it under the GNU GPLv2, so if you find any improvements, you are forced to share them. I still have the impression that it is rather slow, but I currently don’t know how to do it better. Of course, it has to be evaluated on 7×7=49 tiles (where tile=interval×interval) instead of just 7 intervals, what means that the computing time will increase significantly even for the nondilated and nonrotated case. However, it works. function g=nsgauss (p,q,vdil,hdil,rot )% Computes a nonseparable (dilated + rotated) 2D Gaussian% Usage: g = nsgauss(p, q, vdil, hdil, rot);% Input: p,q .... size of g% vdil ... vertical dilation factor (before rotation)% hdil ... horizontal dilation factor (before rotation)% rot .... rotation angle, e.g. pi/4% Example:% norm(nsgauss(p,q,1,1,0)  gaussnk(p)’*gaussnk(q)) == eps%% Version 0.220070525% by Stephan Paukner {stephan+math at paukner dot cc}% Licensed under the GNU General Public License v2% $Id: nsgauss.m,v 1.2 2007/05/25 10:14:52 ps Exp ps $D= [1/vdil 0; 0 1/hdil ]; %dilation matrixR= [cos(rot )  sin(rot ); sin(rot ) cos(rot )]’; %’%rotation matrixsp= sqrt(p ); sq= sqrt(q ); g= zeros(1,p*q ); for jp= 3: 3 for jq= 3: 3 [x y ]= meshgrid( (0:p 1)/sp + jp*sp , (0:q 1)/sq + jq*sq ); v=D*R* [x (: )’; y (: )’ ]; g=g+ exp( pi* (v (1,: ).^ 2 + v (2,: ).^ 2)); endendg= reshape(g,q,p )’; %’g=g/ norm(g, ’fro’); Here’s a comparison of computing time and accuracy:
> tic; g1=gaussnk(600)’ * gaussnk(800); toc
Elapsed time is 0.100037 seconds.
> tic; g2=nsgauss(600,800,1,1,0); toc
Elapsed time is 19.163170 seconds.
> compnorm(g1,g2);
quotient of norms: norm(x)/norm(y) = 1
difference of normalized versions = 1.344e16 And some plots using different parameters:
Monday, May 21. 2007
Yesterday I finally wrote a routine which computes a nonseparable (dilated and rotated) 2D Gaussian. It takes parameters for horizontal and vertical dilation of the 2D Gaussian which is then rotated by a corresponding parameter. If it is neither dilated nor rotated, it is numerically identical to the pure 2D Gaussian. It has N. Kaiblinger’s gaussnk.m as background when it is computed on an area which is 7 times as large as the desired signal length but then wrapped (summed up) into the smaller region. I won’t publish it yet, as it can be done better; currently I separated the tasks of computing and wrapping, but they could be done at once. With this I could finally reproduce a graphic from the paper “2DGA Based on 1D Algorithms”: It shows (1) a nonseparable 2D atom with relatively prime height and width, (2) the atom mapped to vector shape, (3) the dual of that vector with regard to some combined time and frequency steps, and (4) the dual vector reshaped to an image.
The question remains about how to finally do GA using these two 2D atoms. The only ways I found so far was either building the 1D Gabor system (of the atom shaped as vector) or computing a sampled STFT by that 1D vector; this works because modulations stay modulations. However, in the case of separable 2D atoms, this can be done in another way, as I’ll show in another article.
Friday, May 18. 2007
I always wondered why it didn’t work to compute the dual Gabor atom by using the imagetovector methods I explained previously [1,2,3]. Dr. Kaiblinger showed me that the correct way was to use that special isomorphism that walks along the diagonal of the image. Because width and height have to be relatively prime, that path spans the whole image space. And because there are no jumps over pixels, 2Dmodulations stay 1Dmodulations. This is not yet proved formally, but I can already show first experiments: > p=64; q=75; idx=linind(p,q); %index vector
> img=zeros(p,q); N=p*q
N = 4800
> img(idx(1:50))=1; imagesc(img); %step 50
> img(idx(1:100))=1; imagesc(img); %step 100
> img(idx(1:1000))=1; imagesc(img); %step 1000
> img(idx(1:4000))=1; imagesc(img); %step 4000 A 2Dfrequency is given as a tensor product of two 1Dfrequencies with signal lengths p and q, respectively. If their modulation parameters are given as k_{p} and k_{q}, then the corresponding 2Dmodulation is given by a 1Dmodulation of length N and parameter k_{N} = k_{q} p − k_{p} q (mod N) ; as already said, a proof is not given yet. > kp=4; frp = exp(2*pi*i*(0:p1) * kp/p);
> kq=3; frq = exp(2*pi*i*(0:q1) * kq/q);
> frpq = frp’ * frq; size(frpq)
ans =
64 75
> imagesc(real(frpq))
> kN = mod(kq*pkp*q, N)
kN = 4692
> frN = exp(2*pi*i*(0:N1) * kN/N);
> plot(real(frN(1:1000)))
> frN2=zeros(p,q);
> frN2(idx)=frN;
> compnorm(frpq, frN2)
quotient of norms: norm(x)/norm(y) = 1
difference of normalized versions = 1.478e12
ans = 1.4780e12 So those two 2Dfrequencies are really identical. The plot of the second one is identical to the first one, so we skip it here. Now we want to see if the 2Ddual of a separable 2D atom obtained by that isomorphism is identical to the tensor product of the two 1Dduals.
Continue reading "Images to vectors: Correct isomorphism"
Friday, April 27. 2007
I still have some difficulties understanding sampling lattices in the 4D positionfrequency space of images. Here’s a flow of thoughts on this:
A 1Dsignal of length N can be interpreted as an element of C^{N}. (This was always confusing me: an Ndimensional vector is only a 1Dsignal!) But in TFA it is actually considered as an infinite periodic vector with period N. Therefore the index set is not just {1,...,N} but rather Z_{N}:=Z/NZ. This set is a finite group under addition modulo N. The TFdomain of that 1Dsignal space is Z_{N}×Z_{N} and therefore 2D; it still has the group structure (by components). A sampling subset Λ of this TFplane takes out certain timelocations and frequencies. If one has a fixed window function g with same signal length, the Gabor family with regard to that sampling subset is the set of those shifts and modulations of the Gabor atom g where the timelocations and frequencies are given by Λ. If Λ has enough elements, namely Λ>N, then...??? No, the redundancy doesn’t tell anything about whether the Gabor family is a frame! It is automatically a frame (in the finite setting) as soon as it spans the whole signal space. So what does that redundancy tell us? And why is it automatically large enough as soon as the Gabor family is a frame? Or what? Another open question is why or when one should have a subgroup as sampling subset. What advances does one get when the sampling is done on a subgroup? If Λ is a subgroup, then there is a dual? Is there no dual when Λ is no subgroup? Or does it just depend on the redundancy then? It might have something to do with the fact that the Gabor frame operator commutes with TFshifts along that subgroup. The things get even trickier for 2Dsignals. An image of size p×q is also considered as an infinite periodic signal with periods p in one direction and q in the other. The signal space has N=pq dimensions. The index set is Z_{p}×Z_{q}. The TFdomain (actually positionfrequency domain) becomes (Z_{p}×Z_{q})×(Z_{p}×Z_{q}) and is therefore 4D. What’s unclear here is the term of separability with regard to the sampling subset. In the 1Dcase (TFplane is 2D) one names the sampling subset a separable lattice when the shifts and modulations are defined by the multiples of a divisor of N. I.e., one gets a rectangular grid (lattice) in the TFplane in this case: For every timeshift there is the same set of modulations. A nonseparable case could be given by a set of random sampling points. But these don’t form a subgroup in general. A nonseparable subgroup could look like a rotated version of a separable lattice. For a rotation by 45° (π/4) one gets the special case of a quincunx lattice (if the correct distances are taken). And now to separabililty of a 4Dlattice: What does a 4D quincunx look like? Or another 4D nonseparable lattice? Does this mean that whenever I can split into two 2Dlattices, one talks about separable lattices, independently from the question whether these two are separable again? I think separable means here that one either has a positionlattice ((x_{1},x_{2})lattice) and a frequency lattice ((ω_{1},ω_{2})lattice) or an (x_{1},ω_{1}) and an (x_{2},ω_{2})lattice, independently from whether these are separable themselves or not. But how does one construct a 4Dset out of two 2Dsets with MATLAB/Octave?
Continue reading "Sampling lattices as subgroups"
Wednesday, April 18. 2007
I currently wonder what string gauge (diameter) I should use for my electric guitar. The standard is .009 (i.e. 0.009″ for the high estring). Thinner strings allow easier string bending, but one has to play with less finger pressure to avoid detuning. As I like to tune all strings down by one halftone, the strings get even “softer”. So one should take a higher string gauge when tuning down. The question is now: Do downtuned .010’ers correspond to normally tuned .009’ers? What gauge should one use when one wants to tune down e.g. by a whole tone and have the “softness” of .009’ers? Here’s my try of a calculation: Does a downtune by one octave correspond to a loss of half the tension? Whatever amount the tension will get, it doesn’t decrease linearly with the halftones—remember the different distances between the frets! How does one calculate this scale? You can’t just divide the half of the string length by 12 to get the steps between the frets. Calculating the fret stepping of a guitar For the 12th halftone, you arrive at ½ of the string length, for the 24th halftone you arrive at ¼ of the length, etc.; the 12th divides the length by 2, the 24th divides it by 4, the 36th divides it by 8, etc. Abbreviating d(n) for the divisior at halftone n, we have d(0)=1, d(12)=2, d(24)=4, d(36)=8, etc., i.e. d(12n)=2^{n} and therefore d(n)=2^{n/12}. For the nth halftone, the string length becomes $url1%5Cdisplaystyle%20L_n%20%3A%3D%20%5Cfrac%7BL%7D%7Bd%28n%29%7D%20%3D%20%5Cfrac%7BL%7D%7B2%5E%7Bn%2F12%7D%7D%20%3D%20%5Cfrac%7BL%7D%7B%5Csqrt%5B12%5D%7B2%5En%7D%7D$url2\displaystyle L_n := \frac{L}{d(n)} = \frac{L}{2^{n/12}} = \frac{L}{\sqrt[12]{2^n}}$url3\displaystyle L_n := \frac{L}{d(n)} = \frac{L}{2^{n/12}} = \frac{L}{\sqrt[12]{2^n}}$url4. If you don’t believe my derivation, maybe you believe a (modified) function plot: n=0:24; dv=1./(2.^(n/12));
plot([1dv 1], ones(1, length(dv)+1), ’+r’)
hold; plot(1dv(0+1), 1, ’+b’) %zeroth fret
plot(1dv(12+1), 1, ’+b’) %first octave
plot(1dv(24+1), 1, ’+b’) %second octave
plot(1dv(5+1), 1, ’+m’) %fifth fret
The blue crosses indicate the first two octaves, occurring when ½ (=50%) and ¾ (=75%) of the length are removed. The pink cross indicates the fifth fret at about ¼ (=25.085%) of the length. Every guitarist should recognize the fret stepping here! Gauge stepping when tuning down I found a link which explains that the relation between string diameter δ, tension t and frequency f is $url1%5Cdisplaystyle%20%5Cdelta%20%3D%20C%5Cfrac%7B%5Csqrt%7Bt%7D%7D%7BfL%7D$url2\displaystyle \delta = C\frac{\sqrt{t}}{fL}$url3\displaystyle \delta = C\frac{\sqrt{t}}{fL}$url4, where C is a constant depending on the material. The aforementioned scaling is still valid for the frequencies, i.e., when the frequency f is tuned down by n halftones, the resulting frequency f_{n} is given as $url1%5Cdisplaystyle%20f_n%20%3D%20%5Cfrac%7Bf%7D%7B%5Csqrt%5B12%5D%7B2%5En%7D%7D$url2\displaystyle f_n = \frac{f}{\sqrt[12]{2^n}}$url3\displaystyle f_n = \frac{f}{\sqrt[12]{2^n}}$url4, what can be verified for f=440Hz: The next lower tunes are 415.3Hz (n=1) and 392Hz (n=2), and the next higher tunes are 466.2Hz (n=−1) and 493.9Hz (n=−2). Replacing f by f_{n} in the previous equation yields that when the tension is to be kept, one has to take strings with diameter $url1%5Cdelta_n%20%3D%20%5Cdelta%5Ccdot%5Csqrt%5B12%5D%7B2%5En%7D$url2\delta_n = \delta\cdot\sqrt[12]{2^n}$url3\delta_n = \delta\cdot\sqrt[12]{2^n}$url4. As example, .009’er strings that are tuned down by one whole tone should be replaced by .010’ers to keep the tension of the .009’ers. The other way round, fixing δ_{n}=0.010 (the taken string gauge) and δ=0.009 (the desired string gauge “feel”), one derives $url1%5Cdisplaystyle%20n%20%3D%20%5Clog_2%20%5Cleft%28%20%5Cfrac%7B%5Cdelta_n%7D%7B%5Cdelta%7D%20%5Cright%29%5E%7B12%7D$url2\displaystyle n = \log_2 \left( \frac{\delta_n}{\delta} \right)^{12}$url3\displaystyle n = \log_2 \left( \frac{\delta_n}{\delta} \right)^{12}$url4 what results in n=1.82 in this example, a downtune of slightly less than a whole tone. Tuning down .010’ers a “complete” whole tone corresponds to a string gauge of .0089’ers, so really almost .009’ers. As a final rule of thumb, the steps between the string gauges correspond to tune changes of a whole tone. The following table shows how certain string gauges “feel” when they are tuned down: tuning  .008  .009  .0095  .010  .011  .012  .013  E♭  .0076  .0085  .009  .0094  .0104  .0113  .0123  D  .0071  .008  .0085  .0089  .0098  .0107  .0116  D♭  .0067  .0076  .008  .0084  .0092  .010  .011  C  .0063  .0071  .0075  .0079  .0087  .0095  .0103  B  .006  .0067  .0071  .0075  .0082  .009  .0097 
Thursday, April 12. 2007
Given an image im of size p× q with gcd( p, q)=1.  Find a number s that is not divided by the divisors of N=pq.
 Index vector, method 1:
 idx = mod(1:s:N*s, N);
 nil = find(idx==0); idx(nil)=N;
 Index vector, method 2 (preferred):
 idx = mod((1:s:N*s)1, N)+1;
 imv(idx) = im(:);
Reconstruction from vector of length N= pq:  rim = reshape(imv(idx), p, q);
Let’s try this with our zebra. In my previous tries I simply took p=479 and q=480 to have gcd(479,480)=1, but 479 is a prime number and I therefore have no possibility to divide it by some step size. For GA I have to lay some kind of lattice over the image. So there is actually a second condition forbidding p and q to be prime. I take p=473 instead, it has 11 and 43 as divisors.
> im=img(4:476,:);
> [p q]=size(im), N=p*q
p = 473
q = 480
N = 227040
> alldiv(N)(1:13)
ans =
2 3 4 5 6 8 10 11 12 15 16 20 22 As one can see, N has many divisors, but e.g. 19 is not one of them. We can’t take e.g. 9 although it doesn’t occur either, as it is divided by 3, which is a divisor of N. Taking such a wrong step size will lead one back to the first pixel too early and one can never span the whole image. > s=9; idx=mod((1:s:N*s)1, N)+1;
> find(idx==1)
ans =
1 75681 151361
> s=19; idx=mod((1:s:N*s)1, N)+1;
> find(idx==1)
ans = 1
> find(idx==2)
ans = 23900
> imv(idx) = im(:);
> plot(imv) And now the other way round:
> rim=reshape(imv(idx),p,q);
> size(rim)
ans =
473 480
> imagesc(rim) Indeed looks like our zebra!
By the way, another try of comparing the FFT of the vector with the FFT2 of the image shows again that one cannot get the FFT2 by doing an FFT of the vector. This is because the linepatterns in the image might not be preserved under that transform, as neighboring pixels might not become neighbors in the resulting vector.
Wednesday, April 4. 2007
I found a quicker way to compute the index matrix I mentioned:
> tic; [jj kk]=meshgrid(1:p,1:q); toc
Elapsed time is 0.080879 seconds.
> size(jj), size(kk)
ans =
479 480
ans =
479 480
> tic; idx=mod(b*q*jj+a*p*kk,N)’; toc
Elapsed time is 0.129128 seconds.
> idx(p,q)=N; %valid index value
> imv=zeros(1,N);
> tic; for j=1:p; for k=1:q;
> imv(idx(j,k)) = im(j,k);
> end; end; toc
Elapsed time is 6.770042 seconds. So nothing with 58 seconds anymore.
Tuesday, April 3. 2007
Regarding the previous questions I noticed that GA on LCA groups is too general for my concerns and that the study of GA on finite abelian groups is enough.
In a certain paper HGFei published a result that for p×qimages where p and q are relatively prime there is an isomorphism to a vector of length N=pq. The theory considers the groups $url1%5CbbZ_p%5Ctimes%5CbbZ_q$url2\bbZ_p\times\bbZ_q$url3\bbZ_p\times\bbZ_q$url4 and $url1%5CbbZ_N$url2\bbZ_N$url3\bbZ_N$url4 where $url1%5CbbZ_k%3A%3D%5CbbZ%2Fk%5CbbZ$url2\bbZ_k:=\bbZ/k\bbZ$url3\bbZ_k:=\bbZ/k\bbZ$url4. One explicit mapping from the matrix indices $url1%5Cleft%28%5Cbegin%7Bsmallmatrix%7Dj%5C%5C%20k%5Cend%7Bsmallmatrix%7D%5Cright%29$url2\left(\begin{smallmatrix}j\\ k\end{smallmatrix}\right)$url3\left(\begin{smallmatrix}j\\ k\end{smallmatrix}\right)$url4 to the corresponding vector indices is given by
$url1%5Ciota%5Ccolon%20%5CbbZ_p%5Ctimes%5CbbZ_q%20%5Cto%20%5CbbZ_N%2C%20%5Cquad%20%5Cleft%28%5Cbegin%7Bmatrix%7Dj%5C%5C%20k%5Cend%7Bmatrix%7D%5Cright%29%20%5Cmapsto%20%5Cbeta%20qj%20%2B%20%5Calpha%20pk%20%5Cpmod%20N$url2\iota\colon \bbZ_p\times\bbZ_q \to \bbZ_N, \quad \left(\begin{matrix}j\\ k\end{matrix}\right) \mapsto \beta qj + \alpha pk \pmod N$url3\iota\colon \bbZ_p\times\bbZ_q \to \bbZ_N, \quad \left(\begin{matrix}j\\ k\end{matrix}\right) \mapsto \beta qj + \alpha pk \pmod N$url4
where α and β are integers chosen such that
$url1%5Calpha%20p%2B%5Cbeta%20q%20%5Cequiv%201%20%5Cpmod%20N$url2\alpha p+\beta q \equiv 1 \pmod N$url3\alpha p+\beta q \equiv 1 \pmod N$url4.
My 480×480 zebraimage doesn’t have a height and width that are relatively prime, so I simply drop one pixel of height:
> im=img(2:480,:);
> size(im)
ans =
479 480
> gcd(479,480)
ans = 1
Just for a first test I set α=p and β=q and define a primitive mapping function. (I’ll have to find a quicker transform, as in my tests the switching takes too long. But this could be due to the actually large test image. In addition, I currently only manage to do the backward step by storing the indices in an index matrix.)
> function ii=ma2ve(j,k)
> p=479; q=480; a=p; b=q;
> ii = mod(b*q*j+a*p*k , p*q);
> if ii==0; ii=p*q; end
> endfunction
> ma2ve(1,1)
ans = 1
> ma2ve(1,2)
ans = 229442
> p*q
ans = 229920
I have to return N instead of 0 to have a correct index value. Although the mapping is actually an isomorphism I didn’t know how to go back to the tuple (j,k), so I store the indices in an index matrix while building the image vector:
> idx=zeros(p,q); imv=zeros(1,N);
> tic; for ii=1:p; for jj=1:q;
> idx(ii,jj) = ma2ve(ii,jj);
> imv(idx(ii,jj)) = im(ii,jj);
> end; end; toc
Elapsed time is 58.358711 seconds.
58 seconds!? However, the image looks like this as a vector:
> plot(imv)
Continue reading "Images to vectors and back"
