Edit distance between two partitions

I have two partitions of $[1 \ldots n]$ and am looking for the edit distance between them.

By this I mean the minimal number of single moves of a node into a different group that are necessary to go from partition A to partition B.

For example, the distance from {0 1} {2 3} {4} to {0} {1} {2 3 4} would be two.
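To make the definition concrete, here is a minimal brute-force sketch that searches over single-element moves with breadth-first search. It assumes that a move may also place an element into a new, previously empty group, and the names `canonical`, `neighbours` and `move_distance_bfs` are purely illustrative. The state space grows with the Bell numbers, so this is only useful for checking tiny examples.

```python
from collections import deque


def canonical(blocks):
    """Order-insensitive, hashable form of a partition (empty blocks dropped)."""
    return frozenset(frozenset(b) for b in blocks if b)


def neighbours(partition):
    """Yield every partition reachable by moving one element to another block.

    Assumption: moving an element into a brand-new block counts as one move.
    """
    blocks = [set(b) for b in partition]
    for i, src in enumerate(blocks):
        for x in src:
            # Move x into each other existing block.
            for j in range(len(blocks)):
                if j != i:
                    new = [set(b) for b in blocks]
                    new[i].discard(x)
                    new[j].add(x)
                    yield canonical(new)
            # Move x into a new singleton block (pointless if x is already alone).
            if len(src) > 1:
                new = [set(b) for b in blocks]
                new[i].discard(x)
                new.append({x})
                yield canonical(new)


def move_distance_bfs(a, b):
    """Minimal number of single-element moves turning partition a into b."""
    start, goal = canonical(a), canonical(b)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        current, dist = queue.popleft()
        if current == goal:
            return dist
        for nxt in neighbours(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("the two partitions are not over the same ground set")


print(move_distance_bfs([{0, 1}, {2, 3}, {4}], [{0}, {1}, {2, 3, 4}]))  # 2
```

Here one shortest sequence is: move 4 into {2 3}, then move 1 into a new group.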

After searching I came across this paper, but a) I am not sure whether their distance takes the ordering of the groups into account (something I don't care about), b) I am not sure how it works, and c) there are no references.

Any help appreciated.

ds.algorithms edit-distance lattice

— Zenna

What would you consider the distance between {0 1 2 3} and {0 1} {2 3} to be? Would it be 2? Second, I don't see why "graphs" come into the picture at all. It sounds like you have two partitions of [n] and want to compute a distance between them.

— Suresh Venkat

Yes, it would be two. These are in fact partitions of the set of nodes of a graph (i.e. a graph partition). That is probably not important for the solution, but it is the problem I am trying to solve, which is why I mentioned it.

— zenna

If the graph is irrelevant, please remove everything about "graphs" and "nodes" from your question; it does not help, it distracts.

— Jukka Suomela

Couldn't the edit distance be defined in terms of the distance in the partition lattice?

— Tegiri Nenashi

@Tegiri - It is indeed the geodesic distance on the lattice of partitions. Unfortunately, this lattice is intractable to compute for any set of cardinality greater than 10.

— zenna

Answers:

This problem can be transformed into the assignment problem, also known as the maximum weighted bipartite matching problem.

First note that the edit distance equals the number of elements that need to change from one set to another. This equals the total number of elements minus the number of elements that do not need to change. So finding the minimum number of elements that change is equivalent to finding the maximum number of vertices that do not change.

Let $A = \{ A_1, A_2, ..., A_k \}$ and $B = \{ B_1, B_2, ..., B_l \}$ be partitions of $[1, 2, ..., n]$. Also, without loss of generality, let $k \ge l$ (allowed because $edit(A, B) = edit(B, A)$). Then let $B_{l+1}$, $B_{l+2}$, ..., $B_k$ all be the empty set. Then the maximum number of vertices that do not change is:

$\max_f \sum_{i=1}^k |A_i \cap B_{f(i)} |$

where $f$ is a permutation of $[1, 2, ..., k]$.

This is exactly the assignment problem where the vertices are $A_1$, ..., $A_k$, $B_1$, ..., $B_k$ and the edges are pairs $(A_i, B_j)$ with weight $|A_i \cap B_j|$. This can be solved in $O(|V|^2 \log |V| + |V||E|)$ time.
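As a sanity check of the formula, here is a small sketch that evaluates $\max_f \sum_i |A_i \cap B_{f(i)}|$ literally, by trying every permutation $f$. It is not the polynomial-time method mentioned above: it takes factorial time and is only meant for tiny inputs. The function names are illustrative, and elements are 0-indexed to match the example in the question.

```python
from itertools import permutations


def max_unchanged(a, b):
    """Largest total overlap over all one-to-one pairings of blocks."""
    a = [set(s) for s in a]
    b = [set(s) for s in b]
    # Pad the partition with fewer blocks with empty sets, so both have k blocks.
    k = max(len(a), len(b))
    a += [set() for _ in range(k - len(a))]
    b += [set() for _ in range(k - len(b))]
    return max(
        sum(len(a[i] & b[f[i]]) for i in range(k))
        for f in permutations(range(k))
    )


def edit_distance_bruteforce(a, b):
    """n minus the maximum number of elements that can stay in place."""
    n = sum(len(s) for s in a)
    return n - max_unchanged(a, b)


print(edit_distance_bruteforce([{0, 1}, {2, 3}, {4}], [{0}, {1}, {2, 3, 4}]))  # 2
```

In the example, pairing {0 1} with {0}, {2 3} with {2 3 4} and {4} with {1} keeps 1 + 2 + 0 = 3 elements in place, so the distance is 5 - 3 = 2.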

— bbejot

Could you name the algorithm, which gives this time complexity please?

— D-503

I believe @bbejot is referring to the successive shortest path algorithm (with Dijkstra's algorithm, implemented using Fibonacci heaps, as the subroutine).
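For illustration, a rough sketch of successive shortest paths applied to this matching, under a few assumptions of mine: it uses Python's binary heap (`heapq`) rather than a Fibonacci heap, so it does not reach the bound quoted in the answer; edge costs are shifted to `big - weight` so that they start out non-negative; and all names are made up for this example.

```python
import heapq
from math import inf


def max_weight_assignment(weights):
    """Maximum total weight of a perfect matching in a k x k weight matrix."""
    k = len(weights)
    if k == 0:
        return 0
    big = max(max(row) for row in weights)
    source, sink = 2 * k, 2 * k + 1
    # Each edge is [target, capacity, cost, index of the reverse edge].
    graph = [[] for _ in range(2 * k + 2)]

    def add_edge(u, v, cost):
        graph[u].append([v, 1, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i in range(k):
        add_edge(source, i, 0)          # source -> A_i
        add_edge(k + i, sink, 0)        # B_i -> sink
        for j in range(k):
            add_edge(i, k + j, big - weights[i][j])  # A_i -> B_j

    potential = [0] * len(graph)
    total_cost = 0
    for _ in range(k):  # one unit of flow per augmentation
        dist = [inf] * len(graph)
        dist[source] = 0
        prev = [None] * len(graph)
        heap = [(0, source)]
        while heap:  # Dijkstra on reduced costs
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue
            for idx, (v, cap, cost, _) in enumerate(graph[u]):
                nd = d + cost + potential[u] - potential[v]
                if cap > 0 and nd < dist[v]:
                    dist[v] = nd
                    prev[v] = (u, idx)
                    heapq.heappush(heap, (nd, v))
        for v in range(len(graph)):
            if dist[v] < inf:
                potential[v] += dist[v]
        v = sink
        while v != source:  # push one unit along the shortest path
            u, idx = prev[v]
            edge = graph[u][idx]
            edge[1] -= 1
            graph[v][edge[3]][1] += 1
            total_cost += edge[2]
            v = u
    return k * big - total_cost


def edit_distance_ssp(a, b):
    a = [set(s) for s in a]
    b = [set(s) for s in b]
    k = max(len(a), len(b))
    a += [set() for _ in range(k - len(a))]  # pad with empty blocks
    b += [set() for _ in range(k - len(b))]
    weights = [[len(x & y) for y in b] for x in a]
    return sum(len(x) for x in a) - max_weight_assignment(weights)


print(edit_distance_ssp([{0, 1}, {2, 3}, {4}], [{0}, {1}, {2, 3, 4}]))  # 2
```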

— Wei

It took me a long time to parse this because I'm not a math person, but thank you. I spent a long time searching and this was the only thing I could find that showed how to convert the partition distance problem to the assignment problem -- or to any algorithm that I could call from some Python library. (The hard part for me has been figuring out how to use scipy.optimize.linear_sum_assignment and then to set up the matrices based on these instructions.)

— Sigfried

I needed to make the weights negative. Otherwise scipy.optimize.linear_sum_assignment gives me 0 for everything.
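For anyone else setting this up, a minimal sketch of how the matrices might look with `scipy.optimize.linear_sum_assignment`. The function name `edit_distance_scipy` is just an illustration, and NumPy and SciPy are assumed to be installed. The solver minimises total cost by default, which is why the overlaps are negated; newer SciPy releases also accept `maximize=True` instead.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def edit_distance_scipy(a, b):
    """Partition edit distance via the assignment problem."""
    a = [set(s) for s in a]
    b = [set(s) for s in b]
    # overlap[i, j] = |A_i ∩ B_j|. A rectangular matrix is accepted, which
    # has the same effect as padding the shorter partition with empty sets.
    overlap = np.array([[len(x & y) for y in b] for x in a])
    # Negate so that minimising cost maximises the total overlap.
    rows, cols = linear_sum_assignment(-overlap)
    unchanged = int(overlap[rows, cols].sum())
    n = sum(len(x) for x in a)
    return n - unchanged


print(edit_distance_scipy([{0, 1}, {2, 3}, {4}], [{0}, {1}, {2, 3, 4}]))  # 2
```

Without the negation the solver looks for the pairing with the smallest total overlap, which is often 0.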

— Sigfried

Look at this paper's PDF

http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0030160

The definition of edit distance in there is exactly what you need I think. The 'reference' partition would be (an arbitrary) one of your two partitions, the other would simply be the other one. Also contains relevant citations.

Best, Rob

— Rob

Thanks Rob. However, unless I am missing something, this is an edit distance defined in terms of split-merge moves. These are well studied and, as the paper points out, the variation of information is an information-theoretic measure of this. I am interested, however, in single-element move transitions.

— zenna

Cranky Sunday morning idea that might or might not be correct:

Wlog, let $P_1$ be the partition with more sets, $P_2$ the other. First, assign pairwise different names $n_1(S) \in \Sigma$ to the sets in $P_1$. Then, find a best naming $n_2(S)$ for the sets in $P_2$ by the following rules:

$n_2(S) := n_1(S')$ for $S \in P_2$ with $S \cap S'$ maximal amongst all $S' \in P_1$ ; pick the one creating the least conflicts if multiple choices are possible.
If now $n_2(S) = n_2(S')$ for some $S \neq S'$, assign the one that shares fewer elements with $S''$, where $n_1(S'') = n_2(S)$, the name of the set in $P_1$ it shares the second most elements with, i.e. have it compete for that set's name.
If the former rule cannot be applied, check for both sets whether they can compete for the name of other sets they share fewer elements with (they might still have more elements from some $S'' \in P_1$ than the sets that got assigned its name!). If so, assign that name to the one of $S, S'$ that shares more elements with the respective set whose name they can compete for; the other keeps the formerly conflicting name.
Iterate this procedure until all conflicts are resolved. Since $P_1$ does not have fewer sets than $P_2$, there are enough names.

Now, you can consider the bit strings of your elements wrt either partition, i.e. $w_1 = n_1(1) \cdot \dots \cdot n_1(n)$ and $w_2 = n_2(1) \cdot \dots \cdot n_2(n)$ (with $n_j(i) = n_j(S), i \in S \in P_j$ ). Then, the desired quantity is $d_H(w_1, w_2)$ , i.e. the Hamming distance between the bit strings.
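A small sketch of just this last step, assuming a conflict-free naming has already been found. The naming `n2` below is picked by hand for the example from the question and is not the output of the rules above; the function names are illustrative, and elements are 0-indexed here.

```python
def label_string(partition, names, n):
    """w = n(0) n(1) ... n(n-1): the name of the block containing each element."""
    w = [None] * n
    for block, name in zip(partition, names):
        for x in block:
            w[x] = name
    return w


def hamming_distance(w1, w2):
    """Number of positions at which the two label strings differ."""
    return sum(x != y for x, y in zip(w1, w2))


p1 = [{0}, {1}, {2, 3, 4}]
p2 = [{0, 1}, {2, 3}, {4}]
n1 = ["a", "b", "c"]   # pairwise different names for the sets of P_1
n2 = ["a", "c", "b"]   # hand-picked naming for P_2 (an assumption)

w1 = label_string(p1, n1, 5)   # a b c c c
w2 = label_string(p2, n2, 5)   # a a c c b
print(hamming_distance(w1, w2))  # 2
```

The two strings differ at elements 1 and 4, which are exactly the two elements that have to move.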

— Raphael