Compressed sensing analogs

22

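In compressed sensing, there is a signal vector $x \in \mathbb{R}^n$ with $\|x\|_0 < k$, and its compressed representation is $Ax$, where $A$ is an $R$-by-$n$ matrix with $R \ll n$. One can construct $A$ so that any $k$-sparse $x$ can be recovered from $Ax$ with $R$ as small as $O(k n^{o(1)})$. I may not have the best-known parameters, but this is the general idea.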

My question is: are there similar phenomena in other settings? What I mean is that the input signal could come from some "low-complexity family" according to a measure of complexity that is not necessarily sparsity. We then want compression and decompression algorithms, not necessarily linear maps, that are efficient and correct. Are such results known in a different context? What would your guess be for a more "general" theory of compressed sensing?

(Of course, in applications of compressed sensing, linearity and sparsity are important issues. The question I am asking here is more "philosophical".)

ds.algorithms ds.data-structures compressed-sensing

— arnab

21

Your question addresses the "exact" recovery problem (we want to recover a $k$-sparse $x$ exactly, given $Ax$). In what follows, though, I will focus on the "robust" version, where $x$ is an arbitrary vector and the goal of the recovery algorithm is to find a $k$-sparse approximation $x'$ to $x$ (this distinction actually matters for some of the discussion below). Formally, you want to solve the following problem (call it $P_1$):

Design $A$ such that for any $x$ one can recover $x'$ where $\|x-x'\|_L \le \min_{x''} C \|x-x''\|_R$, where $x''$ ranges over all $k$-sparse vectors.
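As a small illustration of the right-hand side of this guarantee: for the $\ell_p$ norms, the minimum over $k$-sparse $x''$ is attained by keeping the $k$ largest-magnitude entries of $x$. The sketch below computes that benchmark; the function names `best_k_term` and `benchmark_error` are made up for this example.

```python
import numpy as np

def best_k_term(x, k):
    """Closest k-sparse vector to x in any l_p norm: keep the k largest |x_i|."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    if k > 0:
        keep = np.argsort(np.abs(x))[-k:]   # indices of the k largest magnitudes
        out[keep] = x[keep]
    return out

def benchmark_error(x, k, p=2):
    """min over k-sparse x'' of ||x - x''||_p (the guarantee's right side, without C)."""
    x = np.asarray(x, dtype=float)
    return np.linalg.norm(x - best_k_term(x, k), ord=p)

# A vector that is "almost" 3-sparse: three large entries plus a small tail.
x = np.array([5.0, -0.1, 0.2, -4.0, 0.05, 3.0])
print(best_k_term(x, 3))
print(benchmark_error(x, 3, p=1), benchmark_error(x, 3, p=2))
```

A recovery scheme for $P_1$ is then judged by how close $\|x-x'\|_L$ gets to $C$ times this tail error.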

Now, some analogs and generalizations.

Arbitrary basis. First, observe that any scheme satisfying the above definition can be used to solve a more general problem, where the recovered signal $x'$ is sparse in an arbitrary basis (e.g., wavelets or Fourier), not just the standard one. Let $B$ be the basis matrix. Formally, a vector $u$ is $k$-sparse in basis $B$ if $u=Bv$ where $v$ is $k$-sparse. Now we can consider the generalized problem (call it $P_B$):

Design $A_B$ such that, given $A_B x$, one can recover $x'$ where $\|x-x'\|_L \le \min_{x''} C \|x-x''\|_R$, where $x''$ ranges over all vectors that are $k$-sparse in $B$.

One can reduce this problem to the earlier problem $P_1$ by changing the basis, i.e., by using the measurement matrix $A_B = A B^{-1}$. If we have a solution to $P_1$ in the $\ell_2$ norm (i.e., the left and the right norms are both $\ell_2$), we also get a solution to $P_B$ in the $\ell_2$ norm. If $P_1$ uses other norms, we solve $P_B$ in those norms modified by the change of basis.
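To spell the reduction out, here is a short derivation. It is a sketch that assumes $B$ is invertible, and orthonormal for the last step, which is what lets the $\ell_2$ norm carry over unchanged. Write $v = B^{-1}x$ for the coefficients of $x$ in the basis $B$. Then

$$A_B x = A B^{-1} x = A v .$$

Running the $P_1$ decoder on $Av$ gives a $k$-sparse $v'$ with $\|v-v'\|_2 \le C \min_{v''} \|v-v''\|_2$, where $v''$ ranges over $k$-sparse vectors. Setting $x' = Bv'$ and using $\|Bw\|_2 = \|w\|_2$,

$$\|x-x'\|_2 = \|B(v-v')\|_2 = \|v-v'\|_2 \le C \min_{v''} \|v-v''\|_2 = C \min_{x''} \|x-x''\|_2 ,$$

where $x'' = Bv''$ ranges over all vectors that are $k$-sparse in $B$.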

One caveat is that, in the above approach, we need to know the matrix $B$ in order to define $A_B$. Perhaps surprisingly, if we allow randomization ($A_B$ is not fixed but is instead chosen at random), it is possible to choose $A_B$ from a fixed distribution that is independent of $B$. This is the so-called universality property.
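A minimal numerical sketch of this idea, under assumptions that are mine rather than the answer's: the measurement matrix is i.i.d. Gaussian, $B$ is a random orthonormal basis, and the decoder is Orthogonal Matching Pursuit (a simple greedy method chosen for brevity, not the algorithm of any particular paper). The point is that `A` is drawn without looking at `B`; only the decoder uses `B`.

```python
import numpy as np

def omp(Phi, y, k):
    """Orthogonal Matching Pursuit: greedily pick k columns of Phi to explain y."""
    support = []
    residual = y.copy()
    coef = np.zeros(0)
    for _ in range(k):
        corr = np.abs(Phi.T @ residual)
        if support:
            corr[support] = 0.0              # never pick the same column twice
        support.append(int(np.argmax(corr)))
        # Least-squares fit of y on the columns selected so far.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    v = np.zeros(Phi.shape[1])
    v[support] = coef
    return v

def demo(n=200, R=100, seed=0):
    rng = np.random.default_rng(seed)
    # Measurement matrix drawn from a fixed distribution, independent of B.
    A = rng.standard_normal((R, n)) / np.sqrt(R)
    # Hypothetical orthonormal basis (a random rotation), known only to the decoder.
    B, _ = np.linalg.qr(rng.standard_normal((n, n)))
    v = np.zeros(n)
    v[[7, 50, 123]] = [3.0, -2.0, 1.5]       # 3-sparse coefficients in basis B
    x = B @ v                                # generally dense in the standard basis
    y = A @ x                                # R < n linear measurements
    v_hat = omp(A @ B, y, k=3)               # decode with the effective matrix A B
    x_hat = B @ v_hat
    return np.linalg.norm(x - x_hat) / np.linalg.norm(x)

print("relative recovery error:", demo())
```

For an orthonormal $B$, the product of a Gaussian matrix with $B$ is again Gaussian, which is one way to see why the same distribution works for every such basis.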

Dictionaries. The next generalization is obtained by dropping the requirement that $B$ be a basis. Instead, we can allow $B$ to have more rows than columns. Such matrices are called (overcomplete) dictionaries. One popular example is the identity matrix on top of the Fourier matrix. Another example is a matrix whose rows are the characteristic vectors of all intervals in {1...n}; in this case, the set $\{Bu : u \text{ is } k\text{-sparse}\}$ contains all "$k$-histograms", i.e., piecewise constant functions over {1...n} with at most $k$ pieces.
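The interval dictionary can be made concrete with a few lines of code. This is a sketch with my own conventions: indices are 0-based rather than {1...n}, and since the atoms are stored as rows, the signal is synthesized as `B.T @ u`. The helper name `interval_dictionary` is made up for the example.

```python
import numpy as np

def interval_dictionary(n):
    """Rows are the characteristic vectors of all intervals [i, j] in {0, ..., n-1}."""
    intervals = [(i, j) for i in range(n) for j in range(i, n)]
    B = np.zeros((len(intervals), n))
    for row, (i, j) in enumerate(intervals):
        B[row, i:j + 1] = 1.0
    return B, intervals

n = 8
B, intervals = interval_dictionary(n)

# A 3-histogram: value 2 on [0,2], value 5 on [3,4], value 1 on [5,7].
pieces = {(0, 2): 2.0, (3, 4): 5.0, (5, 7): 1.0}
u = np.zeros(len(intervals))
for interval, value in pieces.items():
    u[intervals.index(interval)] = value

signal = B.T @ u   # atoms are rows here, so synthesis uses the transpose
print(B.shape, np.count_nonzero(u), signal)
```

The dictionary has $n(n+1)/2$ atoms for a signal of length $n$, so it is far from a basis, yet the histogram uses only three of them.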

As far as I know, there is no general theory for such arbitrary dictionaries, although there has been a fair amount of work on this topic. See, e.g., Candes-Eldar-Needell'10 or Donoho-Elad-Temlyakov, IEEE Transactions on Information Theory, 2004.

Sketching for histograms has been extensively investigated in the streaming and database literature; see, e.g., Gilbert-Guha-Indyk-Kotidis-Muthukrishnan-Strauss, STOC 2002 or Thaper-Guha-Indyk-Koudas, SIGMOD 2002.

Models. (Also mentioned by Arnab.) A different generalization is to introduce restrictions on the sparsity patterns. Let $M$ be a subset of the $k$-subsets of {1...n}. We say that $u$ is $M$-sparse if the support of $u$ is included in an element of $M$. We can now pose the problem (call it $P_M$):

Design $A$ such that for any $x$ one can recover $x'$ where $\|x-x'\|_L \le \min_{x''} C \|x-x''\|_R$, where $x''$ ranges over all $M$-sparse vectors.

For example, the elements of $M$ could be of the form $I_1 \cup \ldots \cup I_k$, where each $I_i$ corresponds to one "sub-block" of {1...n} of some length $b$, i.e., $I_i$ is of the form {jb+1 ... (j+1)b} for some $j$. This is the so-called "block sparsity" model.
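For the block model, the best $M$-sparse approximation in the $\ell_2$ norm is easy to compute: keep the $k$ aligned blocks with the most energy. The sketch below does that, with 0-based indexing and under the assumption that $b$ divides $n$; `best_block_sparse` is a name invented for this example.

```python
import numpy as np

def best_block_sparse(x, b, k):
    """Keep the k aligned length-b blocks of x with the largest l2 energy."""
    x = np.asarray(x, dtype=float)
    assert x.size % b == 0, "this sketch assumes b divides n"
    blocks = x.reshape(-1, b)                # row j is block {jb, ..., (j+1)b - 1}
    energy = np.sum(blocks ** 2, axis=1)
    keep = np.argsort(energy)[-k:]           # the k most energetic blocks
    out = np.zeros_like(blocks)
    out[keep] = blocks[keep]
    return out.reshape(-1)

x = np.array([0.1, 0.2, 3.0, -3.0, 0.0, 0.1, 2.0, 2.5])
print(best_block_sparse(x, b=2, k=2))
```

This plays the same role for $P_M$ that keeping the $k$ largest entries plays for $P_1$: it gives the benchmark error on the right-hand side of the guarantee.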

The benefit of models is that one can save on the number of measurements, compared to the generic $k$-sparsity approach. This is because the space of $M$-sparse signals is smaller than the space of all $k$-sparse signals, so the matrix $A$ needs to preserve less information. For more, see Baraniuk-Cevher-Duarte-Hegde, IEEE Transactions on Information Theory, 2010 or Eldar-Mishali, IEEE Transactions on Information Theory, 2009.
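A rough way to see how much smaller the model is: count the allowed supports. The sketch below compares the block model (choose $k$ of the $n/b$ aligned blocks) with generic sparsity at the same support size $kb$. The parameter values are arbitrary, and the log of the count is only a crude proxy for the number of measurements needed, not a bound from the cited papers.

```python
from math import comb, log2

def count_supports(n, b, k):
    """Number of allowed supports: block model vs. generic sparsity of size k*b."""
    assert n % b == 0, "this sketch assumes b divides n"
    block_model = comb(n // b, k)     # choose k of the n/b aligned blocks
    generic = comb(n, k * b)          # choose any k*b coordinates
    return block_model, generic

block_model, generic = count_supports(n=64, b=4, k=2)
print(block_model, generic)
print(f"log2 of support count: block {log2(block_model):.1f}, generic {log2(generic):.1f}")
```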

Hope this helps.

— Piotr

11

There is a generalization of compressed sensing to the non-commutative setting called matrix completion. In the exact setting, you are given an unknown $m \times n$ matrix $M$ which, instead of sparsity, is known to have low rank $r \ll m,n$ . Your goal is to reconstruct the $r$ singular values and singular vectors of this matrix by sampling only $\tilde{O}(rm+rn)$ coefficients of the matrix, rather than $O(mn)$ as required in the worst case.

If the singular vectors are sufficiently "incoherent" (roughly, not too well aligned) with the basis in which you are sampling matrix elements, then you can succeed with high probability by solving a convex program, similar to standard compressed sensing. In this case, you have to minimize the Schatten 1-norm, i.e. the sum of the singular values.
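To get a feel for the problem, here is a small numerical sketch. Note that it does not solve the convex program described above. Instead it uses a simple heuristic (alternately filling in the observed entries and truncating to rank $r$ via the SVD) that assumes the rank is known. The sizes, sampling rate, and function names are assumptions made for the example. The Schatten 1-norm of the result is computed at the end only to show what that quantity is.

```python
import numpy as np

def truncate_rank(Y, r):
    """Best rank-r approximation of Y via the SVD."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def hard_impute(M_obs, mask, r, iters=500):
    """Alternate between matching the observed entries and truncating to rank r."""
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        Y = np.where(mask, M_obs, X)   # agree with the data wherever it is observed
        X = truncate_rank(Y, r)
    return X

def demo(m=20, n=20, r=1, p=0.5, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical low-rank ground truth with well-spread ("incoherent") factors.
    M = rng.uniform(1, 2, (m, r)) @ rng.uniform(1, 2, (r, n))
    mask = rng.random((m, n)) < p      # each entry is observed with probability p
    X = hard_impute(np.where(mask, M, 0.0), mask, r)
    nuclear_norm = np.linalg.svd(X, compute_uv=False).sum()   # Schatten 1-norm
    rel_err = np.linalg.norm(X - M) / np.linalg.norm(M)
    return rel_err, nuclear_norm

rel_err, nuclear_norm = demo()
print("relative error:", rel_err, "Schatten 1-norm of estimate:", nuclear_norm)
```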

This problem also has lots of applications, for example, to giving book recommendations to a customer of an online book store from knowing only the few ratings that other customers have generated. In this context, the rows and columns of $M$ are labeled by the books and the customers, respectively. The few visible matrix elements are the customer ratings of the books they previously bought. The matrix $M$ is expected to be low rank because we believe that typically only a few primary factors influence our preferences. By completing $M$ , the vendor can make accurate predictions about which books you are likely to want.

A good start is this paper by Candès and Recht, Exact Matrix Completion via Convex Optimization. There is also a really cool generalization where you are allowed to sample in an arbitrary basis for the matrix space. This paper by David Gross, Recovering low-rank matrices from few coefficients in any basis, uses this generalization to substantially simplify the proofs of matrix completion, and for some bases you can remove the incoherence assumption as well. That paper also contains the best bounds to date on the sampling complexity. It may sound strange to sample in an arbitrary basis, but it is actually quite natural in the setting of quantum mechanics; see, for example, this paper, Quantum state tomography via compressed sensing.

— Steve Flammia

9

There is manifold-based compressed sensing, in which the sparsity condition is replaced by the condition that the data lie on a low-dimensional submanifold of the natural space of signals. Note that sparsity can be phrased as lying on a particular manifold (in fact, a secant variety).

See, for example, this paper and the references in its introduction. (I admittedly do not know if this paper is representative of the area -- I am more familiar with the related topic of manifold-based classifiers a la Niyogi-Smale-Weinberger.)

— Joshua Grochow

interesting paper. I wasn't aware of this work.

— Suresh Venkat

incidentally, as Candes pointed out in his SODA 10 invited talk, sparsity is not the same as being low-dimensional. it's quite easy to have one without the other

— Suresh Venkat

Thanks! One interesting work cited by the linked paper is "Model-based compressive sensing". It shows, I think, that the number of measurements can be reduced even more than in regular CS if the input signal is promised to come from some small set of K-dimensional subspaces.

— arnab

8

I suppose that, at the level of generality in which I've posed the question, the paper "Compression of samplable sources" by Trevisan, Vadhan and Zuckerman (2004) also qualifies as one possible answer. They show that in many cases, if the source of input strings is of low complexity (e.g., samplable by logspace machines), then one can compress, and decompress, in polynomial time to length an additive constant away from the entropy of the source.

I don't really know though if compressed sensing can be put into some larger theory of compression.

— arnab

3

One analog of compressive sensing is in machine learning when you try to estimate a high dimensional weight vector (e.g., in classification/regression) from a very small sample size. To deal with underdetermined systems of linear equations in such settings, one typically enforces sparsity (via l0 or l1 penalty) on the weight vector being learned. To see the connection, consider the following classification/regression problem from machine learning:

Represent the N examples of D dimensions each (D >> N) as an NxD matrix X. Represent the N responses (one for each example) as an Nx1 vector Y. The goal is to solve for a Dx1 vector theta via the following equation: Y = X*theta

Now here is the analogy of this problem to compressive sensing (CS): you want to estimate/measure theta which is a D dimensional vector (akin to an unknown "signal" in CS). To estimate this, you use a matrix X (akin to the design matrix in CS) and N 1-D measurements Y (akin to the compressed signal in CS, since D >> N).
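A minimal sketch of this setup, assuming noiseless responses, a Gaussian `X`, and an l1 penalty solved by iterative soft-thresholding (ISTA). The solver, the penalty weight `lam`, and the problem sizes are my choices for illustration, not something prescribed above.

```python
import numpy as np

def ista(X, y, lam, iters=3000):
    """Minimize 0.5*||X theta - y||^2 + lam*||theta||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2      # 1 / Lipschitz constant of the gradient
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        z = theta - step * (X.T @ (X @ theta - y))                     # gradient step
        theta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft-threshold
    return theta

def demo(N=50, D=200, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((N, D)) / np.sqrt(N)   # N examples, D >> N features
    theta_true = np.zeros(D)
    theta_true[[3, 40, 150]] = [3.0, -2.0, 1.5]    # hypothetical sparse weight vector
    y = X @ theta_true                             # N responses, Y = X*theta
    theta_hat = ista(X, y, lam=0.01)
    return theta_true, theta_hat

theta_true, theta_hat = demo()
print("largest estimated weights at:", np.sort(np.argsort(np.abs(theta_hat))[-3:]))
print("max abs error:", np.max(np.abs(theta_hat - theta_true)))
```

Without the penalty, the system Y = X*theta has infinitely many solutions when D >> N; the l1 term is what selects a sparse one.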

— spinxl39

2

See: http://www.damtp.cam.ac.uk/user/na/people/Anders/Inf_CS43.pdf