Mean time between failures for exponential distribution.

Let's say I have n independent machines that fail according to independent exponential distributions with a mean of 1000 days. The machines can be shut down by the operator, so some of the samples we observe will be censored. For example, if a machine was started this morning and shut down in the afternoon with no failures in between, we can only say that the time to its next failure was more than 12 hours, not exactly how much more. So we have some sampled observations, $t_i$, where we observe the actual inter-arrival times between failures, and some censored observations, $x_j$, where we only observe that the failure took longer than this time. How do we get the best estimate for the rate, $\lambda$? For this, we can use maximum likelihood.



$$L(\lambda)=\prod_{i=1}^{n} f_T(t_i) \prod_{j=1}^{m} F_T(x_j)$$



where $f_T(t) = \lambda e^{-\lambda t}$ is the PDF of the exponential distribution and $F_T(t) = e^{-\lambda t}$ is the survival function. The log-likelihood then becomes:



$$ll(\lambda) = \sum_{i=1}^{n} \left(\log(\lambda)-\lambda t_i\right) + \sum_{j=1}^{m} \left(-\lambda x_j\right)$$

Differentiating with respect to $\lambda$ and setting to zero we get:



$$\frac{1}{\lambda} = \frac{\sum_{i=1}^{n} t_i + \sum_{j=1}^{m} x_j}{n}$$
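
For reference, the differentiation step behind this result can be written out term by term, using the same symbols as the log-likelihood (each uncensored term contributes $1/\lambda - t_i$, each censored term contributes $-x_j$):

$$\frac{d\,ll}{d\lambda} = \sum_{i=1}^{n} \left(\frac{1}{\lambda} - t_i\right) - \sum_{j=1}^{m} x_j = \frac{n}{\lambda} - \sum_{i=1}^{n} t_i - \sum_{j=1}^{m} x_j = 0$$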



where $\frac{1}{\lambda}$ is the MTBF. Here, $n$ is the number of sampled (uncensored) observations, in other words, instances where we actually witnessed the time between two failures.
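
As a minimal sketch of the formula as I have stated it, here is the estimator in Python. The function name `mtbf_censored_mle` and the sample numbers are made up for illustration and are not part of any library.

```python
def mtbf_censored_mle(observed, censored):
    """Estimate 1/lambda from uncensored times t_i and censored times x_j.

    Implements (sum(t_i) + sum(x_j)) / n, where n = len(observed),
    exactly as in the formula above.
    """
    n = len(observed)
    if n == 0:
        raise ValueError("need at least one uncensored observation")
    return (sum(observed) + sum(censored)) / n


# Hypothetical data: two complete inter-failure times and one censored run.
observed = [900.0, 1100.0]
censored = [500.0]
print(mtbf_censored_mle(observed, censored))
```

Note that the censored times add to the numerator but not to the count in the denominator.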



This is where things seem to fall apart. Let's say there is a single machine that fails every 1000 days. I run it continuously for 3000 days and hence see 3 failures. From the final formula above, $n$ should be 2, since I only saw it fail consecutively twice (day 1000 to day 2000 and day 2000 to day 3000). The interval leading up to the first failure is a censored observation. However, this would give an MTBF of 3000/2 = 1500 days, which is wrong.



Now consider another scenario. Suppose I have 2000 such machines, but can observe each of them only within the span of a day. If I take any random day, I'll need to sample 2000 machines on average before I observe two failures (and then $n = 1$). So the MTBF for this combined system is 1 day / 1 sample between failures = 1 day. Since this is for the system of 2000 machines, the MTBF for a single machine must be $1 \times 2000 = 2000$ days. This is a factor of two off.



Note that in both of these examples, if I take $n$ to simply be the number of failures, everything works out. But that doesn't seem to be its definition. What am I missing here?
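
To make the discrepancy concrete, here is a small sketch of the arithmetic for both scenarios under the two ways of counting $n$. The helper `mtbf_estimate` is a made-up name, and the total of 2000 machine-days in the second scenario is my reading of "2000 machines observed for one day each".

```python
def mtbf_estimate(total_time, n):
    """Total observed time (uncensored plus censored) divided by n."""
    return total_time / n


# (label, total observed time in days, number of failures seen)
scenarios = [
    ("one machine, 3000 days", 3000.0, 3),
    ("2000 machines, one day each", 2000.0, 2),
]

for label, total_time, failures in scenarios:
    # n counted as intervals between consecutive failures
    by_intervals = mtbf_estimate(total_time, failures - 1)
    # n counted as the number of failures
    by_failures = mtbf_estimate(total_time, failures)
    print(label, by_intervals, by_failures)
```

With $n$ counted as intervals between consecutive failures the estimates are 1500 and 2000 days; with $n$ counted as failures both come out to 1000 days.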







asked Aug 5 at 22:59 by Rohit Pandey
