up vote
1
down vote

favorite

The problem

I have a multiset $M=m_1, dots, m_n$ of positive integers (that is, a number $m_i$ can appear multiple times in $M$), and a positive integer $k$.

I am looking for an algorithm to determine the greatest greatest common divisor (greatest gcd; "greatest greatest" is not a typo) of all sub-multisets that can be derived from $M$ by deleting $k$ elements. Formally,

$$ textgcd_k(M) := max_substack Nsubset M \ #N=k textgcd (Msetminus N), $$

where $N$ is again a multiset.

What I've tried

The case $k=0$ is the "ordinary" $textgcd$, which I'll assume we can calculate easily:

$$ textgcd_0(M) = textgcdbig(textset(M)big), $$

where $textset(M)$ associates to a multiset $M$ its underlying set.

Without loss of generality, we can assume that the multiplicity of each entry in $M$ is greater than $k$, because any element of $M$ that occurs more than $k$ times will not be "active" in determining $textgcd_k(M)$. Formally, if $M'subset M$ consists of those elements of $M$ with multiplicity at most $k$ (with multiplicities, i.e., $M'$ is again a multiset), then

$$ textgcd_k (M) = textgcdBig(bigtextgcd_k(M') cup textgcdbigtextset(Msetminus M'big)Big). $$

We can calculate $textgcd_k(M)$ recursively,

$$ textgcd_k(M) = max_min M textgcd_k-1big(Msetminusmbig). $$

This is what I am doing right now on a dataset, and which is running long enough for me to post this question. I'd prefer something quicker...

Environment

I don't have large numbers. $M$ won't contain much more than 100 numbers, counted with multiplicities, and $k$ won't exceed 10. However, I do need to do this quickly on thousands or even millions of different $M$s.

Why do I care?

I am working on time series that are "almost-multiples" of an underlying $textgcd$, like this one:

time series

These are orders a retail store places at the wholesaler. The underlying multiple is a logistical unit, which I would like to infer from the orders, since it may not be available elsewhere in the system. What complicates matters, and motivates the "leave-$k$-out" aspect, is that sometimes orders are placed which are "not" multiples of this logistical unit.

Disregarding the time dimension for the moment, a table of the values here looks like this (I'll discard the zeros first thing):

$$ beginarray*5c
hline
m_i & 0 & 240 & 432 & 552 & 864 \
hline
#m_i & 705 & 1 & 15 & 1 & 3 \
hline
endarray $$

We can calculate $textgcd_k(M)$ recursively as above and obtain:

$$ beginarrayc
hline
k & 0 & 1 & 2 & 3 & 4 & 5 & 6 \
hline
textgcd_k(M) & 24 & 48 & 432 & 432 & 432 & 432 & 432 \
hline
endarray $$

GCDs

asked Jul 16 at 15:40

Stephan Kolassa

445410

add a commentÂ |Â

up vote
1
down vote

favorite

The problem

I have a multiset $M=m_1, dots, m_n$ of positive integers (that is, a number $m_i$ can appear multiple times in $M$), and a positive integer $k$.

$$ textgcd_k(M) := max_substack Nsubset M \ #N=k textgcd (Msetminus N), $$

where $N$ is again a multiset.

What I've tried

The case $k=0$ is the "ordinary" $textgcd$, which I'll assume we can calculate easily:

$$ textgcd_0(M) = textgcdbig(textset(M)big), $$

where $textset(M)$ associates to a multiset $M$ its underlying set.

Without loss of generality, we can assume that the multiplicity of each entry in $M$ is greater than $k$, because any element of $M$ that occurs more than $k$ times will not be "active" in determining $textgcd_k(M)$. Formally, if $M'subset M$ consists of those elements of $M$ with multiplicity at most $k$ (with multiplicities, i.e., $M'$ is again a multiset), then

$$ textgcd_k (M) = textgcdBig(bigtextgcd_k(M') cup textgcdbigtextset(Msetminus M'big)Big). $$

We can calculate $textgcd_k(M)$ recursively,

$$ textgcd_k(M) = max_min M textgcd_k-1big(Msetminusmbig). $$

This is what I am doing right now on a dataset, and which is running long enough for me to post this question. I'd prefer something quicker...

Environment

Why do I care?

I am working on time series that are "almost-multiples" of an underlying $textgcd$, like this one:

time series

Disregarding the time dimension for the moment, a table of the values here looks like this (I'll discard the zeros first thing):

$$ beginarray*5c
hline
m_i & 0 & 240 & 432 & 552 & 864 \
hline
#m_i & 705 & 1 & 15 & 1 & 3 \
hline
endarray $$

We can calculate $textgcd_k(M)$ recursively as above and obtain:

$$ beginarrayc
hline
k & 0 & 1 & 2 & 3 & 4 & 5 & 6 \
hline
textgcd_k(M) & 24 & 48 & 432 & 432 & 432 & 432 & 432 \
hline
endarray $$

GCDs

asked Jul 16 at 15:40

Stephan Kolassa

445410

add a commentÂ |Â

up vote
1
down vote

favorite

The problem

I have a multiset $M=m_1, dots, m_n$ of positive integers (that is, a number $m_i$ can appear multiple times in $M$), and a positive integer $k$.

$$ textgcd_k(M) := max_substack Nsubset M \ #N=k textgcd (Msetminus N), $$

where $N$ is again a multiset.

What I've tried

The case $k=0$ is the "ordinary" $textgcd$, which I'll assume we can calculate easily:

$$ textgcd_0(M) = textgcdbig(textset(M)big), $$

where $textset(M)$ associates to a multiset $M$ its underlying set.

Without loss of generality, we can assume that the multiplicity of each entry in $M$ is greater than $k$, because any element of $M$ that occurs more than $k$ times will not be "active" in determining $textgcd_k(M)$. Formally, if $M'subset M$ consists of those elements of $M$ with multiplicity at most $k$ (with multiplicities, i.e., $M'$ is again a multiset), then

$$ textgcd_k (M) = textgcdBig(bigtextgcd_k(M') cup textgcdbigtextset(Msetminus M'big)Big). $$

We can calculate $textgcd_k(M)$ recursively,

$$ textgcd_k(M) = max_min M textgcd_k-1big(Msetminusmbig). $$

This is what I am doing right now on a dataset, and which is running long enough for me to post this question. I'd prefer something quicker...

Environment

Why do I care?

I am working on time series that are "almost-multiples" of an underlying $textgcd$, like this one:

time series

Disregarding the time dimension for the moment, a table of the values here looks like this (I'll discard the zeros first thing):

$$ beginarray*5c
hline
m_i & 0 & 240 & 432 & 552 & 864 \
hline
#m_i & 705 & 1 & 15 & 1 & 3 \
hline
endarray $$

We can calculate $textgcd_k(M)$ recursively as above and obtain:

$$ beginarrayc
hline
k & 0 & 1 & 2 & 3 & 4 & 5 & 6 \
hline
textgcd_k(M) & 24 & 48 & 432 & 432 & 432 & 432 & 432 \
hline
endarray $$

GCDs

asked Jul 16 at 15:40

Stephan Kolassa

445410

The problem

I have a multiset $M=m_1, dots, m_n$ of positive integers (that is, a number $m_i$ can appear multiple times in $M$), and a positive integer $k$.

$$ textgcd_k(M) := max_substack Nsubset M \ #N=k textgcd (Msetminus N), $$

where $N$ is again a multiset.

What I've tried

The case $k=0$ is the "ordinary" $textgcd$, which I'll assume we can calculate easily:

$$ textgcd_0(M) = textgcdbig(textset(M)big), $$

where $textset(M)$ associates to a multiset $M$ its underlying set.

Without loss of generality, we can assume that the multiplicity of each entry in $M$ is greater than $k$, because any element of $M$ that occurs more than $k$ times will not be "active" in determining $textgcd_k(M)$. Formally, if $M'subset M$ consists of those elements of $M$ with multiplicity at most $k$ (with multiplicities, i.e., $M'$ is again a multiset), then

$$ textgcd_k (M) = textgcdBig(bigtextgcd_k(M') cup textgcdbigtextset(Msetminus M'big)Big). $$

We can calculate $textgcd_k(M)$ recursively,

$$ textgcd_k(M) = max_min M textgcd_k-1big(Msetminusmbig). $$

This is what I am doing right now on a dataset, and which is running long enough for me to post this question. I'd prefer something quicker...

Environment

Why do I care?

I am working on time series that are "almost-multiples" of an underlying $textgcd$, like this one:

time series

Disregarding the time dimension for the moment, a table of the values here looks like this (I'll discard the zeros first thing):

$$ beginarray*5c
hline
m_i & 0 & 240 & 432 & 552 & 864 \
hline
#m_i & 705 & 1 & 15 & 1 & 3 \
hline
endarray $$

We can calculate $textgcd_k(M)$ recursively as above and obtain:

$$ beginarrayc
hline
k & 0 & 1 & 2 & 3 & 4 & 5 & 6 \
hline
textgcd_k(M) & 24 & 48 & 432 & 432 & 432 & 432 & 432 \
hline
endarray $$

GCDs

asked Jul 16 at 15:40

Stephan Kolassa

445410

asked Jul 16 at 15:40

Stephan Kolassa

445410

asked Jul 16 at 15:40

Stephan Kolassa

445410

asked Jul 16 at 15:40

Stephan Kolassa

445410

add a commentÂ |Â

1 Answer
1

active

oldest

votes

up vote
0
down vote

Are you memoizing $gcd_k$? Because with $|M| = 100$, $k=10$ that should finish rather quickly. Or alternatively, you can build up a dynamic programming table where $G[n][k]$ is $gcd_k$ of the first $n$ numbers (with $G[0][k] = 0$ and by convention $gcd(0, a) = a$). With a recurrence like this:

$$G[n][k] = maxbig(gcd(G[n-1][k], a_n), G[n-1][k-1]big)$$

Again, with $|M|cdot k approx 1000$ that ought to finish really quickly.

Addendum: to consider multiplicities, the $k-1$ (leaving out $a_n$) should be $k - m_n$ (leaving out all $m_n$ copies of $a_n$).

answered Jul 18 at 15:44

Timon Knigge

484

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
);
);
, "mathjax-editing");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "69"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2853516%2fleave-k-out-greatest-common-divisor%23new-answer', 'question_page');

);

Post as a guest

Name

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
0
down vote

$$G[n][k] = maxbig(gcd(G[n-1][k], a_n), G[n-1][k-1]big)$$

Again, with $|M|cdot k approx 1000$ that ought to finish really quickly.

Addendum: to consider multiplicities, the $k-1$ (leaving out $a_n$) should be $k - m_n$ (leaving out all $m_n$ copies of $a_n$).

answered Jul 18 at 15:44

Timon Knigge

484

add a commentÂ |Â

up vote
0
down vote

$$G[n][k] = maxbig(gcd(G[n-1][k], a_n), G[n-1][k-1]big)$$

Again, with $|M|cdot k approx 1000$ that ought to finish really quickly.

Addendum: to consider multiplicities, the $k-1$ (leaving out $a_n$) should be $k - m_n$ (leaving out all $m_n$ copies of $a_n$).

answered Jul 18 at 15:44

Timon Knigge

484

add a commentÂ |Â

up vote
0
down vote

$$G[n][k] = maxbig(gcd(G[n-1][k], a_n), G[n-1][k-1]big)$$

Again, with $|M|cdot k approx 1000$ that ought to finish really quickly.

Addendum: to consider multiplicities, the $k-1$ (leaving out $a_n$) should be $k - m_n$ (leaving out all $m_n$ copies of $a_n$).

answered Jul 18 at 15:44

Timon Knigge

484

$$G[n][k] = maxbig(gcd(G[n-1][k], a_n), G[n-1][k-1]big)$$

Again, with $|M|cdot k approx 1000$ that ought to finish really quickly.

Addendum: to consider multiplicities, the $k-1$ (leaving out $a_n$) should be $k - m_n$ (leaving out all $m_n$ copies of $a_n$).

answered Jul 18 at 15:44

Timon Knigge

484

answered Jul 18 at 15:44

Timon Knigge

484

answered Jul 18 at 15:44

Timon Knigge

484

answered Jul 18 at 15:44

Timon Knigge

484

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

Search This Blog

ukmuiik

Leave-$k$-out greatest common divisor

The problem

What I've tried

Environment

Why do I care?

The problem

What I've tried

Environment

Why do I care?

The problem

What I've tried

Environment

Why do I care?

The problem

What I've tried

Environment

Why do I care?

1 Answer
1

Your Answer

Post as a guest

1 Answer
1

1 Answer
1

Post as a guest

Comments

Post a Comment

Popular posts from this blog

What is the equation of a 3D cone with generalised tilt?

Relationship between determinant of matrix and determinant of adjoint?

Color the edges and diagonals of a regular polygon

Category

Random preview

Leave-$k$-out greatest common divisor

The problem

What I've tried

Environment

Why do I care?

The problem

What I've tried

Environment

Why do I care?

The problem

What I've tried

Environment

Why do I care?

The problem

What I've tried

Environment

Why do I care?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

1 Answer 1

1 Answer 1

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Comments

Post a Comment

Popular posts from this blog

What is the equation of a 3D cone with generalised tilt?

Relationship between determinant of matrix and determinant of adjoint?

Color the edges and diagonals of a regular polygon

1 Answer
1

1 Answer
1

1 Answer
1