C# - Efficient powerset algorithm for subsets of minimal length
I am using the following C# function to get a powerset limited to subsets of a minimal length:
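using System.Collections.Generic;
using System.Linq;

string[] PowerSet(int min_len, string set)
{
    // Start from a collection that contains only the empty subset.
    IEnumerable<IEnumerable<string>> seed =
        new List<IEnumerable<string>>() { Enumerable.Empty<string>() };

    return set.Replace(" ", "")
              .Split(',')
              // For each element b, keep every existing subset and add a copy extended with b.
              .Aggregate(seed, (a, b) => a.Concat(a.Select(x => x.Concat(new[] { b }))))
              .Where(subset => subset.Count() >= min_len)
              .Select(subset => string.Join(",", subset))
              .ToArray();
}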
The problem is that when the original set is large, the algorithm has to work very hard even if the minimal length is large as well.
E.g.:
PowerSet(27, "1,11,12,17,22,127,128,135,240,254,277,284,292,296,399,309,322,326,333,439,440,442,447,567,580,590,692,697");
This should be very easy, but it takes far too long with the above function. I am looking for a concise modification of my function that could efficiently handle these cases.
Taking a quick look at your method, one of the inefficiencies is that every possible subset is created, regardless of whether it has enough members to warrant inclusion in the limited super set.
Consider implementing the following extension method instead. This method can trim out some unnecessary subsets based on their count to avoid excess computation.
using System.Collections.Generic;

public static List<List<T>> PowerSet<T>(List<T> startingSet, int minSubsetSize)
{
    List<List<T>> subsetList = new List<List<T>>();

    // The set bits of each intermediate value represent unique
    // combinations from the startingSet.
    // We can start checking for combinations at (1 << minSubsetSize) - 1 since
    // values less than that will not yield large enough subsets.
    // Note: 1 << startingSet.Count overflows int for sets of 31 or more elements.
    int iLimit = 1 << startingSet.Count;
    for (int i = (1 << minSubsetSize) - 1; i < iLimit; i++)
    {
        // Get the number of 1's in 'i'.
        int setBitCount = NumberOfSetBits(i);

        // Only include this subset if it will have at least minSubsetSize members.
        if (setBitCount >= minSubsetSize)
        {
            List<T> subset = new List<T>(setBitCount);

            for (int j = 0; j < startingSet.Count; j++)
            {
                // If the j'th bit in i is set,
                // then add the j'th element of the startingSet to this subset.
                if ((i & (1 << j)) != 0)
                {
                    subset.Add(startingSet[j]);
                }
            }
            subsetList.Add(subset);
        }
    }
    return subsetList;
}
The number of set bits in each incremental i tells you how many members will be in the subset. If there are not enough set bits, then there is no point in doing the work of creating the subset represented by that bit combination.
NumberOfSetBits can be implemented in a number of ways. See "How to count the number of set bits in a 32-bit integer?" for various approaches, explanations and references. Here is one example taken from that question.
public static int NumberOfSetBits(int i)
{
    // Sum the bits in pairs, then in nibbles, then add the four byte totals together.
    i = i - ((i >> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}
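If you are on .NET Core 3.0 or later, the framework's own population count can stand in for the hand-written bit trick. This is only a minimal sketch under that assumption; the wrapper class name BitCountSketch is hypothetical and not part of the original answer.

```csharp
using System.Numerics;

public static class BitCountSketch
{
    // Same contract as NumberOfSetBits above: count the 1 bits in i.
    public static int NumberOfSetBits(int i)
    {
        // PopCount is defined for unsigned values, so reinterpret the bits of i.
        return BitOperations.PopCount(unchecked((uint)i));
    }
}
```

On hardware that supports it, the runtime can map this call to a single instruction, so it is unlikely to be slower than the manual version.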
Now, while this solution works for your example, I think you will run into long runtimes and memory issues if you lower the minimum subset size too far or continue to grow the size of the startingSet. Without specific requirements posted in your question, I can't judge whether this solution will work for you and/or is safe for your range of expected input cases.
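One way to reduce the runtime further is to stop scanning every value up to 1 << startingSet.Count and instead generate only the bit masks that have exactly k set bits, for each k from the minimum size up to the set size. The sketch below swaps in a different technique from the loop above, the bit trick commonly known as Gosper's hack, and returns subsets lazily so they need not all be held in memory. The names CombinationSketch, MasksWithKBits and SubsetsOfAtLeast are hypothetical, and the 30-element limit is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class CombinationSketch
{
    // Yields every n-bit mask with exactly k set bits, in increasing order.
    public static IEnumerable<int> MasksWithKBits(int n, int k)
    {
        if (k < 0 || k > n) yield break;
        if (k == 0) { yield return 0; yield break; }

        long limit = 1L << n;
        long mask = (1L << k) - 1;            // smallest mask with k bits set
        while (mask < limit)
        {
            yield return (int)mask;
            long lowest = mask & -mask;       // lowest set bit of the mask
            long ripple = mask + lowest;      // carry past the lowest run of 1s
            // Next larger value with the same number of set bits.
            mask = (((ripple ^ mask) >> 2) / lowest) | ripple;
        }
    }

    // Lazily yields every subset with at least minSubsetSize members.
    public static IEnumerable<List<T>> SubsetsOfAtLeast<T>(IReadOnlyList<T> startingSet, int minSubsetSize)
    {
        int n = startingSet.Count;
        if (n > 30) throw new ArgumentException("This sketch supports at most 30 elements.");

        for (int k = Math.Max(minSubsetSize, 0); k <= n; k++)
        {
            foreach (int mask in MasksWithKBits(n, k))
            {
                var subset = new List<T>(k);
                for (int j = 0; j < n; j++)
                {
                    if ((mask & (1 << j)) != 0) subset.Add(startingSet[j]);
                }
                yield return subset;
            }
        }
    }
}
```

For the 28-element set in the question with a minimum size of 27, this visits 29 masks (28 of size 27 plus one of size 28) instead of testing 2^28 candidate values.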
If you find that this solution is still too slow, the operations can be split up for parallel computation, perhaps using PLINQ features.
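As a rough illustration of the PLINQ idea, the mask range can be produced with ParallelEnumerable.Range so that filtering and subset construction are spread across cores. This is a sketch and not a tuned implementation: the class name ParallelPowerSetSketch is hypothetical, it assumes 0 <= minSubsetSize and at most 30 elements, and the order of the returned subsets is not guaranteed.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ParallelPowerSetSketch
{
    public static List<List<T>> PowerSet<T>(List<T> startingSet, int minSubsetSize)
    {
        int first = (1 << minSubsetSize) - 1;          // smallest mask worth checking
        int count = (1 << startingSet.Count) - first;  // number of masks to check
        if (count <= 0) return new List<List<T>>();

        return ParallelEnumerable.Range(first, count)
            .Where(mask => NumberOfSetBits(mask) >= minSubsetSize)
            .Select(mask =>
            {
                // startingSet is only read here, so sharing it across threads is safe.
                var subset = new List<T>();
                for (int j = 0; j < startingSet.Count; j++)
                {
                    if ((mask & (1 << j)) != 0) subset.Add(startingSet[j]);
                }
                return subset;
            })
            .ToList();
    }

    // Copy of the bit-counting helper so that this block is self-contained.
    private static int NumberOfSetBits(int i)
    {
        i = i - ((i >> 1) & 0x55555555);
        i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
        return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
    }
}
```

Whether this is actually faster depends on the input; for small ranges the partitioning overhead can outweigh the gain, so it is worth measuring before adopting it.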
Lastly, if you would like to dress up the extension method with LINQ, it would look like the following. However, as written, I think you will see slower performance without some changes to it.
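using System;
using System.Collections.Generic;
using System.Linq;

public static IEnumerable<List<T>> PowerSet<T>(List<T> startingSet, int minSubsetSize)
{
    var startingSetIndexes = Enumerable.Range(0, startingSet.Count).ToList();

    // Enumerable.Range takes (start, count), not (start, end), so the count
    // must be the number of masks between the first candidate and 1 << Count.
    int first = (1 << minSubsetSize) - 1;
    int count = Math.Max(0, (1 << startingSet.Count) - first);

    var candidates = Enumerable.Range(first, count)
                               .Where(p => NumberOfSetBits(p) >= minSubsetSize)
                               .ToList();

    foreach (int p in candidates)
    {
        // Pick the elements whose index bit is set in the mask p.
        yield return startingSetIndexes.Where(setInd => (p & (1 << setInd)) != 0)
                                       .Select(setInd => startingSet[setInd])
                                       .ToList();
    }
}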