Optimization problem where the support of a random variable depends on the value of a decision variable
Suppose we have a dynamic program / Markov decision process whose objective contains a term of the form $v_t(s)=\max\limits_{u_1,u_2}\, r_t(s,u_1,u_2) + E\left[v_{t+1}\bigl(u_1+\min(u_2,\epsilon)\bigr)\right]$, where $r_t(s,u_1,u_2)$ is a reward function of the state $s$ and the decision variables $u_1, u_2$. Here $\epsilon$ is a discrete random variable whose support depends on the value of the decision variable $u_2$: for example, if $u_2 = 7$ then the support of $\epsilon$ is $\{0,1,2,3,4,5,6,7\}$. How can we model this dependence between the support of the random variable and the decision variable in a clean, well-posed way? My main difficulty is formulating the problem when the support and the decision variable are linked like this. Thanks.
probability optimization random-variables markov-process dynamic-programming
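Since $\epsilon$ is discrete and its support is finite once $u_2$ is fixed, one direct way to make the expectation well defined is to write it as an explicit sum over the decision-dependent support, $E[v_{t+1}(u_1+\min(u_2,\epsilon))] = \sum_{k=0}^{u_2} P(\epsilon = k \mid u_2)\, v_{t+1}(u_1+\min(u_2,k))$. A minimal sketch of that computation follows; the conditional pmf `p_eps` (uniform here) and the value function passed as `v_next` are hypothetical placeholders, not part of the original question.

```python
def p_eps(k, u2):
    """Hypothetical conditional pmf: eps uniform on {0, 1, ..., u2}, zero elsewhere."""
    return 1.0 / (u2 + 1) if 0 <= k <= u2 else 0.0

def expected_next_value(v_next, u1, u2):
    """E[v_next(u1 + min(u2, eps))], summing over the decision-dependent support {0,...,u2}."""
    return sum(p_eps(k, u2) * v_next(u1 + min(u2, k)) for k in range(u2 + 1))

# Example: with v_next(x) = x, u1 = 0, u2 = 7 and eps uniform on {0,...,7},
# min(u2, eps) = eps, so the expectation reduces to E[eps] = 3.5.
print(expected_next_value(lambda x: x, 0, 7))  # 3.5
```

Written this way, the "support depends on $u_2$" issue disappears from the probabilistic model: the conditional distribution $P(\epsilon = \cdot \mid u_2)$ is just data of the problem, and the Bellman maximization over $(u_1, u_2)$ evaluates a different (fixed) sum for each candidate $u_2$.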
1
Why not just represent the discrete probability distribution by a vector of probabilities (that are nonnegative and sum to one) and introduce constraints to enforce that the probability of taking on a value is 0 when the decision variables have restricted the support?
– Brian Borchers
Jul 15 at 0:40
I think I see your point. The problem with this is that the probability vector would then itself be a variable, so we would be optimizing over the distribution of $\epsilon$, which of course is not our aim.
– PJORR
Jul 15 at 0:59
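A variant of the probability-vector idea from the comment avoids the objection raised in the reply: put the pmf on a fixed master support $\{0,\dots,K_{\max}\}$ and compute the vector deterministically from $u_2$, so it is data indexed by the decision rather than a free optimization variable. The sketch below assumes a uniform conditional distribution and an upper bound `K_MAX`; both are illustrative choices, not from the original question.

```python
K_MAX = 10  # hypothetical upper bound on the support for any feasible u2

def pmf_vector(u2):
    """Pmf of eps on the master support {0,...,K_MAX}: uniform on {0,...,u2}, zero outside.
    Computed from u2, so it is a parameter of the problem, not a decision variable."""
    p = [0.0] * (K_MAX + 1)
    for k in range(u2 + 1):
        p[k] = 1.0 / (u2 + 1)
    return p

p = pmf_vector(7)
print(sum(p))   # 1.0: nonnegative entries summing to one
print(p[8:])    # entries outside the support {0,...,7} are zero
```

The zero entries outside $\{0,\dots,u_2\}$ play the role of Brian Borchers's support constraints, but since the whole vector is a function of $u_2$, nothing about the distribution is being optimized.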
edited Jul 17 at 2:03
asked Jul 14 at 22:42
PJORR
62