Optimization problem where the support of a random variable depends on the value of a decision variable

Suppose we have a dynamic program / Markov decision process where the objective looks something like this: $v_t(s)=\max\limits_{u_1,u_2} r_t(s,u_1,u_2) + E[v_{t+1}(u_1+\min(u_2,\epsilon))]$, where $r_t(s,u_1,u_2)$ is a reward function of the state $s$ and the decision variables $u_1, u_2$. Here $\epsilon$ is a discrete random variable whose support/range depends on the value of the decision variable $u_2$. For example, if $u_2$ is 7, then the support of $\epsilon$ is $\{0,1,2,3,4,5,6,7\}$. How can we model/write this in a neat, non-problematic way? My main problem is how to formulate the problem with this kind of relation between the support of the random variable and the decision variable. Thanks.
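
For illustration only, here is one way the expectation could be written out explicitly. This is a sketch under an assumption that is not stated in the question: that the distribution of $\epsilon$ given $u_2$ is a known, fixed conditional pmf, written here as $p(k \mid u_2)$ (a symbol introduced for this sketch), with support $\{0,1,\dots,u_2\}$ as in the example.

$$v_t(s)=\max_{u_1,u_2}\Big\{ r_t(s,u_1,u_2) + \sum_{k=0}^{u_2} p(k \mid u_2)\, v_{t+1}\big(u_1+\min(u_2,k)\big) \Big\}$$

Under that assumption every $k$ in the sum satisfies $k \le u_2$, so $\min(u_2,k)=k$ and the term inside $v_{t+1}$ reduces to $u_1+k$.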







  • Why not just represent the discrete probability distribution by a vector of probabilities (that are nonnegative and sum to one) and introduce constraints to enforce that the probability of taking on a value is 0 when the decision variables have restricted the support?
    – Brian Borchers
    Jul 15 at 0:40










  • I think I got your point. The problem with this is that in this case we will be optimizing the distribution of $\epsilon$ since this probability vector will be a variable, which of course is not our aim.
    – PJORR
    Jul 15 at 0:59
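
To make the point of this exchange concrete, here is a minimal sketch (not a method from the thread) in which the probability vector is a fixed function of $u_2$ rather than a variable, so only $u_1$ and $u_2$ are optimized. Everything specific in it is an assumption made for illustration: the uniform conditional pmf `pmf`, the reward function `reward`, the horizon `T`, the bound `S_MAX`, the zero terminal value, and the restriction $u_1+u_2 \le$ `S_MAX` that keeps the next state inside a finite grid.

```python
from functools import lru_cache

T = 3        # hypothetical horizon
S_MAX = 10   # hypothetical bound on states and on u1 + u2


def pmf(k, u2):
    """Assumed conditional pmf of epsilon given u2: uniform on {0, ..., u2}.

    It is a fixed lookup determined by u2, not something the optimizer chooses.
    """
    return 1.0 / (u2 + 1) if 0 <= k <= u2 else 0.0


def reward(t, s, u1, u2):
    """Hypothetical reward r_t(s, u1, u2), chosen only so the sketch runs."""
    return s - 0.5 * u1 - 0.2 * u2


@lru_cache(maxsize=None)
def v(t, s):
    """Value function computed by backward recursion with memoization."""
    if t == T:
        return 0.0  # assumed terminal value
    best = float("-inf")
    for u1 in range(S_MAX + 1):
        for u2 in range(S_MAX - u1 + 1):
            # The support of epsilon is {0, ..., u2}, so the sum runs to u2 only.
            expected = sum(
                pmf(k, u2) * v(t + 1, u1 + min(u2, k)) for k in range(u2 + 1)
            )
            best = max(best, reward(t, s, u1, u2) + expected)
    return best


if __name__ == "__main__":
    print(v(0, 0))
```

The dependence of the support on $u_2$ shows up only in the range of the inner sum and in the lookup `pmf(k, u2)`; no probability appears as a decision variable.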














edited Jul 17 at 2:03
























asked Jul 14 at 22:42









PJORR








