Is it safe to let a user type a regex as a search input?








up vote
78
down vote

favorite
13












I was in a mall a few days ago and I searched for a shop on an indication panel.



Out of curiosity, I tried a search with (.+) and was a bit surprised to get the list of all the shops in the mall.



I've read a bit about evil regexes, but it seems that this kind of attack is only possible when the attacker controls both the text being searched and the search input (the regex).



Can we consider the mall indication panel safe from DoS, given that the attacker only controls the search input? (Leaving aside the possibility that a shop might have some weird name like aaaaaaaaaaaa.)
























  • 24




    If the user can enter a regex, and there's an interpreted language in use, I wouldn't be worried about DoS; I'd be worried about code injection.
    – gowenfawr
    2 days ago






  • 79




    I would not expect a mall map to be designed for sophisticated users that might use regexes. Therefore, if regexes work, it suggests the application is sort of blindly passing the input string in. That's usually a place to try various forms of code and SQL injection. It's that little voice saying "I bet they didn't do that by design..." that makes the antenna perk up. This is a Comment, not an Answer, because (for me) there's not enough info here to say anything more accurate than that.
    – gowenfawr
    2 days ago







  • 11




    Despite the security concerns I would love to perform RegEx filtration in indication panels of huge shopping malls!
    – Daniel
    2 days ago






  • 21




    Did you test any regex that should get matches to determine it actually used regex? If I were to design a mall search I would list all shops if the search result was empty. Either the user is trying to have fun (like you) and the result would not matter or the user isn't good at using the search functionality and they should see something that might be of use to them.
    – Bent
    2 days ago







  • 27




    It is also possible that the search field just ignores any punctuation, and is programmed to return all shops for an essentially empty query.
    – jpa
    2 days ago
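To illustrate the last two comments: a minimal Python sketch of a search box that never hands the query to a regex engine, yet still lists every shop for `(.+)`. The shop names and the `literal_search` helper are made up for illustration; nothing here is known about how the actual mall panel works.

```python
import re

# Hypothetical shop list, for illustration only.
SHOPS = ["Bakery", "Book Shop", "Shoe Store"]


def literal_search(query, shops):
    """Plain substring search that ignores punctuation in the query."""
    # Drop everything that is not a word character or whitespace.
    cleaned = re.sub(r"[^\w\s]", "", query).strip().lower()
    if not cleaned:
        # An essentially empty query lists every shop.
        return list(shops)
    hits = [shop for shop in shops if cleaned in shop.lower()]
    # An empty result also falls back to the full list.
    return hits or list(shops)
```

With this sketch, `literal_search("(.+)", SHOPS)` returns the full list simply because the query is empty once punctuation is stripped, so seeing all shops does not by itself prove that regexes are evaluated.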
















asked 2 days ago by Xavier59, edited 2 days ago







5 Answers

















up vote
69
down vote



accepted










I would compare accepting user-supplied regular expressions to parsing most sorts of structured user input, such as date strings or Markdown, in terms of the risk of code execution. Regular expressions are much more complex than date strings or Markdown (although safely producing HTML from untrusted Markdown has its own risks) and so leave more room for exploitation, but the basic principle is the same: exploitation involves finding unexpected side effects of the parsing/compilation/matching process.



Most regex libraries are mature, and in many languages they are part of the standard library, which is a pretty good (but not certain) indicator that they're free of major issues leading to code execution.
That is to say, accepting regexes does increase your attack surface, but it's not unreasonable to make the measured decision to accept that relatively minor risk.



Denial-of-service attacks are a little trickier. I think most regular expression libraries are designed with performance in mind but do not count mitigation of intentionally slow input among their core design goals. Whether accepting user-supplied regular expressions is appropriate from the DoS perspective depends more on the library.
For example, the .NET regex library accepts a timeout, which could be used to mitigate DoS attacks.
RE2 guarantees execution time linear in the input size, which may be acceptable if you know your search corpus falls within some reasonable size limit.
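As a rough illustration of the timeout idea in a library without one: Python's standard `re` module has no timeout parameter, so this sketch runs the match in a child process and lets `subprocess.run` kill it when the time limit expires. The helper name `search_with_timeout`, the one-name-per-line protocol and the default limit are assumptions made for the example, not anything taken from a real product.

```python
import subprocess
import sys

# Code executed in the child process: argv[1] is the user's pattern,
# stdin carries one shop name per line (names are assumed single-line).
_CHILD_CODE = r"""
import re, sys
rx = re.compile(sys.argv[1])
for name in sys.stdin.read().splitlines():
    if rx.search(name):
        print(name)
"""


def search_with_timeout(pattern, names, seconds=2.0):
    """Return the names matching pattern, or None on timeout or bad pattern."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD_CODE, pattern],
            input="\n".join(names),
            capture_output=True,
            text=True,
            timeout=seconds,  # the child is killed once this expires
        )
    except subprocess.TimeoutExpired:
        return None
    if done.returncode != 0:  # e.g. re.error for a malformed pattern
        return None
    return done.stdout.splitlines()
```

A separate process costs some startup time per search, but unlike a thread it can be killed reliably, so a runaway pattern such as `(0+)+A` against a long run of zeros is cut off instead of occupying a CPU indefinitely.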



In situations where availability is absolutely critical or you're trying to minimize your attack surface as much as possible, it makes sense to avoid accepting user-supplied regexes, but in general I think accepting them is a defensible practice.

























  • 6




    Yes, a timeout is the first thing that comes to mind for mitigating a DoS. Even ignoring library support, it's fairly trivial in most languages/frameworks to spin off the search to a background thread, and have a timeout against that thread.
    – Bob
    2 days ago






  • 6




    @Bob that's trivial yes, but stopping the background task is not. For example in a language like Java there is no way to forcibly terminate a thread, so even if your timeout had expired you would not be able to do anything about it.
    – Boris the Spider
    2 days ago






  • 1




    Ages ago when I became aware of regex and moved beyond the basics to start getting fancy, I was able to create some really horrifically slow regex patterns. A lot of this depends on the regex engine but if you are working with one that supports backreferences, lookaheads/lookbehind and/or greedy quantifiers, it's not too hard to bog things down. Of course the length of the strings you are searching makes a big difference. Multi-line regex on large documents can really be a dog.
    – JimmyJames
    2 days ago






  • 3




    @Nat it relies on cooperative multitasking - i.e. it will cancel(true) the task, which will interrupt() the Thread - if the task is interruptible then this may work, most likely it won't however.
    – Boris the Spider
    2 days ago







  • 4




    Here is an example of a regular expression which takes exponential execution times on Java: (0*)*A
    – Philipp
    2 days ago


















up vote
13
down vote













The main threat in accepting regular expressions lies in your regex execution engine rather than in accepting regexes as such. I'd expect the threat to be very, very low in any well-implemented engine. The engine shouldn't need access to any privileged system resources and should only need to run logic on input provided directly to the engine. This means that even if someone finds an exploit in the interpreter, the damage that can be done should be minimal.



Overall, all regex is designed to do is look for patterns within a value. As long as proper security is followed on the values you check against, there is no reason the engine itself should have any access to modify values. I'd classify it as generally pretty safe.



That said, I'd also only provide it in situations where it made reasonable sense to do so. Regex is complex, potentially time consuming to run, and used in the wrong places could have some undesirable impacts on an application outside of a security context, but in the right use case they are hugely powerful and immensely valuable. (I'm a software architect who refactors hundreds of thousands of lines of code regularly using regex.)























  • 13




    This doesn't cover DoS attacks via, for example, catastrophic backtracking.
    – Boris the Spider
    2 days ago






  • 4




    @boris I didn't consider that a security threat, because handling expensive regexes without interfering with other users is necessary even in normal usage. People will often write excessively complex regexes without intending an attack. Reasonable timeouts are a necessary design decision for performance reasons, not just security. It would be a bit like saying that a security risk of adding a complex report is that people may DoS your site by running the report. That's a performance concern, not a security one.
    – AJ Henderson
    2 days ago







  • 1




    People have crashed servers with regular expressions, and I personally know of one site with hundreds of thousands of users that was crashed by that kind of construct. I can't agree that such damage is minimal, as it took them some time to get it back online.
    – eis
    yesterday










  • @eis did they exploit the regex engine, or were performance safeguards not properly configured, so that a series of runaway regexes took down the server? I said the risk of exploitation of the engine is low. Slow-running queries, even in a DoS sense, are a performance concern, as legitimate queries could also take down the server without proper performance safeguards.
    – AJ Henderson
    yesterday










  • @AJHenderson you're right that it's the latter, not an exploit of the engine. However, even without any exploit, I think the end-user impact might be far from minimal, even if the regex won't modify any values.
    – eis
    yesterday

















up vote
7
down vote













As the other answers have pointed out, the attack vector would most likely be the regex engine.

While you would assume that these engines are quite mature, robust and thoroughly tested, vulnerabilities have been found in the past:



CVE-2010-1792: arbitrary code execution in Apple Safari and iOS.
Quote from the patch notes:




A memory corruption issue exists in WebKit's handling
of regular expressions. Visiting a maliciously crafted website may
lead to an unexpected application termination or arbitrary code
execution.




But of course, the argument of a possibly flawed library holds for everything - even user-provided JPEG files.



The other aspect, albeit not inherently technical, would be the (.+) case you mentioned: Should the product allow arbitrary data retrieval?




































    up vote
    7
    down vote













The problem is that many regex engines "backtrack". When you have a repetition operator (e.g. + or *) in your regex, the engine will try to match it against as much of the input string as possible. If the match later fails, it will backtrack and try matching the repetition against a smaller part of the input string.

Multiple repetition operators can lead to nested backtracking, which can make the time to evaluate the regex blow up massively, especially if the repetition operators are nested.

https://www.regular-expressions.info/catastrophic.html
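To make the blow-up concrete, here is a small hand-written simulation in Python. It is not a real regex engine: it only mimics how a naive backtracking matcher would try the nested pattern `(0+)+A` at the start of a string, and it counts how many times the outer group is attempted. The pattern (a variant of the `(0*)*A` example from the comments, with `+` instead of `*` to keep the sketch short) and the function name `count_attempts` are choices made for this illustration.

```python
def count_attempts(text):
    """Simulate backtracking for (0+)+A anchored at position 0.

    Returns (matched, attempts), where attempts is the number of times
    the outer group was tried at some position.
    """
    attempts = 0

    def attempt(start):
        nonlocal attempts
        attempts += 1
        # Greedy inner 0+: find the longest run of zeros from start.
        run = 0
        while start + run < len(text) and text[start + run] == "0":
            run += 1
        # Try the longest run first, then back off one character at a time.
        for length in range(run, 0, -1):
            end = start + length
            # Option 1: repeat the outer group once more.
            if attempt(end):
                return True
            # Option 2: leave the loop and match the trailing A.
            if end < len(text) and text[end] == "A":
                return True
        return False

    return attempt(0), attempts
```

In this simulation a string of n zeros without a trailing `A` costs 2^n attempts before failing, so every extra zero doubles the work, while the same zeros followed by `A` match after two attempts.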


































      up vote
      2
      down vote













No, ReDoS does not require the attacker to plant unnatural text in the data being searched.

The basic idea of ReDoS is that you have a sub-expression that can match in multiple ways and matches almost everywhere in the searched string except the end, and you iterate that sub-expression to get catastrophic backtracking. So, for example, if your shop description is Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua., you can just use something like ([^q]|[^q][^q])+ (or more complex constructs with e.g. lookaheads).

Whether that's a problem depends: as other answers have explained, you can just limit the time available to the regex engine.
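A back-of-the-envelope sketch of why that pattern is expensive, assuming it is used so that the overall match must eventually fail (for instance anchored as `^([^q]|[^q][^q])+$`, which is an assumption added here). Each iteration of the group consumes one or two characters, so before giving up a naive backtracking engine may try every way of splitting the q-free prefix into chunks of length 1 or 2. The helper name `count_splits` is made up for this example.

```python
DESCRIPTION = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do "
    "eiusmod tempor incididunt ut labore et dolore magna aliqua."
)


def count_splits(n):
    """Number of ways to cut n characters into chunks of length 1 or 2."""
    ways_prev, ways = 1, 1  # splits of 0 characters and of 1 character
    for _ in range(n - 1):
        # The last chunk has length 1 or length 2.
        ways_prev, ways = ways, ways + ways_prev
    return ways if n > 0 else 1


# Characters before the first 'q', where the anchored pattern gets stuck.
prefix = DESCRIPTION[: DESCRIPTION.index("q")]
print(len(prefix), count_splits(len(prefix)))
```

The count grows like the Fibonacci numbers, i.e. exponentially in the length of the prefix, which is why an ordinary-looking shop description is enough.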



























      • I would mention that there are regexp implementations that do not backtrack, and those avoid this problem.
        – Taemyr
        9 hours ago










      • RE2 is already mentioned in another answer. It's not really an implementation though, it's a safe subset of the language - so you'd lose features compared to something like PCRE (arguably features that no one cares about in a product search form).
        – Tgr
        5 hours ago
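For contrast with backtracking, a sketch of the non-backtracking idea, hand-written in Python for the single pattern `^([^q]|[^q][^q])+$`. Instead of exploring one path at a time, it tracks which positions are reachable at all, in the spirit of automaton-based engines. This is not how RE2 is implemented; the function name and the fixed pattern are assumptions made for the example.

```python
def matches_without_backtracking(text):
    """Decide whether ^([^q]|[^q][^q])+$ matches text in linear time."""
    n = len(text)
    # reachable[i] is True when text[:i] can be covered by chunks of
    # one or two characters, none of which is 'q'.
    reachable = [False] * (n + 1)
    reachable[0] = True
    for i in range(n):
        if not reachable[i] or text[i] == "q":
            continue
        reachable[i + 1] = True  # one-character alternative
        if i + 1 < n and text[i + 1] != "q":
            reachable[i + 2] = True  # two-character alternative
    # The + quantifier needs at least one iteration, so "" does not match.
    return n > 0 and reachable[n]
```

Each position is visited once, so even a very long q-free string followed by a `q` is rejected quickly, whereas a backtracking engine could spend exponential time on the same input.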










      edited yesterday by Jan Doggen
      answered 2 days ago by Ryan Jenkins







      • 6




        Yes, a timeout is the first thing that comes to mind for mitigating a DoS. Even ignoring library support, it's fairly trivial in most languages/frameworks to spin off the search to a background thread, and have a timeout against that thread.
        – Bob
        2 days ago






      • 6




        @Bob that's trivial yes, but stopping the background task is not. For example in a language like Java there is no way to forcibly terminate a thread, so even if your timeout had expired you would not be able to do anything about it.
        – Boris the Spider
        2 days ago






      • 1




        Ages ago when I became aware of regex and moved beyond the basics to start getting fancy, I was able to create some really horrifically slow regex patterns. A lot of this depends on the regex engine but if you are working with one that supports backreferences, lookaheads/lookbehind and/or greedy quantifiers, it's not too hard to bog things down. Of course the length of the strings you are searching makes a big difference. Multi-line regex on large documents can really be a dog.
        – JimmyJames
        2 days ago






      • 3




        @Nat it relies on cooperative multitasking - i.e. it will cancel(true) the task, which will interrupt() the Thread - if the task is interruptible then this may work, most likely it won't however.
        – Boris the Spider
        2 days ago







      • 4




        Here is an example of a regular expression which takes exponential execution times on Java: (0*)*A
        – Philipp
        2 days ago


























      up vote
      13
      down vote













      The main threat in accepting regular expressions will be in your regex execution engine rather than accepting regex itself. I'd expect the threat to be very, very low in any well implemented engine. The engine shouldn't need access to any privileged system resources and should only need to run logic on input provided directly to the engine. This means that even if someone finds an exploit in the interpreter, the damage that can be done should be minimal.



      Overall, all regex is designed to do is look for patterns within a value. As long as proper security is followed on the values you check against, there is no reason the engine itself should have any access to modify values. I'd classify it as generally pretty safe.



      That said, I'd also only provide it in situations where it makes reasonable sense to do so. Regexes are complex and potentially time-consuming to run, and used in the wrong places they can have undesirable effects on an application even outside of a security context, but in the right use case they are hugely powerful and immensely valuable. (I'm a software architect who regularly refactors hundreds of thousands of lines of code using regex.)























      • 13




        This doesn't cover DoS attacks via, for example, catastrophic backtracking.
        – Boris the Spider
        2 days ago






      • 4




        @boris I didn't consider that a security threat, since handling expensive regexes without interfering with other users is necessary even in normal usage. People will quite often write excessively complex regexes without it being an attack. Sensible timeouts are a necessary design decision for performance reasons, not just security. It would be a bit like saying a security risk of adding a complex report is that people may DoS your site by running the report. That's a performance concern, not a security one.
        – AJ Henderson
        2 days ago







      • 1




        People have crashed servers with regular expressions, and I personally know of one site that had hundreds of thousands of users getting crashed with that kind of a construct. Can't agree with such damage being minimal, as it took them some time to get it back online.
        – eis
        yesterday










      • @eis did they exploit the regex engine, or were performance safeguards not properly configured, so that a series of runaway regexes took down the server as it tried to evaluate them? I said the risk of exploitation of the engine is low. Slow-running queries, even in a DoS sense, are a performance concern, as legitimate queries could also take down the server without proper performance safeguards.
        – AJ Henderson
        yesterday










      • @AJHenderson you're right in that it's the latter, not about exploiting the engine. However even without any exploit I think the end user impact might be something else than minimal, even if the regex won't modify any values.
        – eis
        yesterday

























      answered 2 days ago by AJ Henderson

















      up vote
      7
      down vote













      As the other answers have pointed out, the most likely attack vector is the regex engine itself.



      While you would assume that these engines are quite mature, robust and thoroughly tested, exploitable bugs have been found in the past:



      CVE-2010-1792: arbitrary code execution in Apple Safari and iOS.
      Quote from the patch notes:




      A memory corruption issue exists in WebKit's handling
      of regular expressions. Visiting a maliciously crafted website may
      lead to an unexpected application termination or arbitrary code
      execution.




      But of course, the argument of a possibly flawed library holds for everything - even user-provided JPEG files.



      The other aspect, albeit not inherently technical, would be the (.+) case you mentioned: Should the product allow arbitrary data retrieval?
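
If regex search is not an intended feature, one option is to treat whatever the user types as a literal string, so that (.+) only matches a shop whose name really contains those characters. The following is a minimal sketch of that idea; the shop list and the `literal_search` name are made up for illustration.

```python
import re

# Hypothetical directory data, for illustration only.
SHOPS = ["Book Nook", "Shoe Palace", "C++ Gadgets"]


def literal_search(query, names=SHOPS):
    """Case-insensitive substring search that gives regex syntax no meaning."""
    if not query.strip():
        return []  # an empty query does not dump the whole directory
    # re.escape() backslash-escapes every regex metacharacter in the query.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    return [name for name in names if pattern.search(name)]


if __name__ == "__main__":
    print(literal_search("shoe"))
    print(literal_search("(.+)"))
```

With the input escaped, the user controls neither the shape of the pattern nor how much data a single query can pull back.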














          edited 2 days ago by Xavier59
          answered 2 days ago by PhilLab




















              up vote
              7
              down vote













              The problem is that many regex engines "backtrack". When you have a repetition operator (e.g. + or *) in your regex, the engine will try to match it against as much of the input string as possible. If the match later fails, it will backtrack and try matching your repetition against a smaller part of the input string.



              Multiple repetition operators can lead to nested backtracking, and this can make the time to evaluate the regex blow up massively, especially if the repetition operators are nested.



              https://www.regular-expressions.info/catastrophic.html
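
To make the blow-up concrete, here is a toy model of the backtracking described above. It is not a real regex engine: it only counts how often a naive backtracking matcher would reach the failing tail when trying a nested pattern such as (a+)+b against n copies of "a" with no "b" anywhere. The function name `failed_attempts` is made up for this sketch.

```python
def failed_attempts(n):
    """Count the dead ends a naive backtracker hits for (a+)+b on 'a' * n."""
    attempts = 0

    def after_group(pos):
        # One iteration of the inner a+ has just ended at index pos.
        nonlocal attempts
        # Option 1: run the group again, trying the longest inner match first.
        for end in range(n, pos, -1):
            after_group(end)
        # Option 2: stop repeating and try to match 'b', which always fails.
        attempts += 1

    # The first iteration of the group must consume at least one 'a'.
    for end in range(n, 0, -1):
        after_group(end)
    return attempts


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, failed_attempts(n))
```

In this model every way of cutting the run of a's into groups is tried once, which gives 2**n - 1 dead ends, so each extra input character roughly doubles the work. Real engines apply various optimisations, so actual timings will differ from this count.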

















                  answered 2 days ago by Peter Green




















                      up vote
                      2
                      down vote













                      No, ReDoS does not require the attacker to craft unnatural search results.



                      The basic idea of ReDoS is that you have a sub-expression that can match in multiple ways and matches almost everywhere in the searched string except the end, and you iterate that sub-expression to get catastrophic backtracking. So for example if your shop description is Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua., you can just use something like ([^q]|[^q][^q])+ (or more complex constructs with e.g. lookaheads).



                      Whether that's a problem depends - as other answers have explained, you can just limit the time available to the regex engine.
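
As a rough illustration of why a pattern like ([^q]|[^q][^q])+ is dangerous on ordinary text, one can count the ways the repeated group can carve up a run of n characters that contain no q. The sketch below assumes a naive backtracking engine that ends up trying every such split because the overall match fails after the group (for example, because something that cannot match follows it); `ways_to_split` is a name made up here.

```python
def ways_to_split(n):
    """Ways to cover n non-'q' characters with pieces of length 1 or 2 (n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    prev, cur = 1, 1  # counts for lengths 0 and 1
    for _ in range(n - 1):
        # The last piece has length 1 or 2, so the counts follow the
        # Fibonacci recurrence.
        prev, cur = cur, prev + cur
    return cur


if __name__ == "__main__":
    text = ("Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do "
            "eiusmod tempor incididunt ut labore et dolore magna aliqua.")
    print(len(text), ways_to_split(len(text)))
```

The count grows like the Fibonacci numbers, that is, exponentially in the length of the text, so even a single sentence of shop description is long enough to keep such an engine busy for an impractical amount of time.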






                      share|improve this answer





















                      • I would mention that there is regexp implementations that does not do backtracking - and those avoids this problem.
                        – Taemyr
                        9 hours ago










                      • RE2 is already mentioned in another answer. It's not really an implementation though, it's a safe subset of the language - so you'd lose features compared to something like PCRE (arguably features that no one cares about in a product search form).
                        – Tgr
                        5 hours ago














                      up vote
                      2
                      down vote













                      No, ReDoS does not require the attacker to craft unnatural search results.



                      The basic idea of ReDoS is that you have a sub-expression that can match in multiple ways and matches almost everywhere in the searched string except the end, and you iterate that sub-expression to get catastrophic backtracking. So for example if your shop description is Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua., you can just use something like ([^q]|[^q][^q])+ (or more complex constructs with e.g. lookaheads).



                      Whether that's a problem depends - as other answers have explained, you can just limit the time available to the regex engine.






                      share|improve this answer





















                      • I would mention that there is regexp implementations that does not do backtracking - and those avoids this problem.
                        – Taemyr
                        9 hours ago










                      • RE2 is already mentioned in another answer. It's not really an implementation though, it's a safe subset of the language - so you'd lose features compared to something like PCRE (arguably features that no one cares about in a product search form).
                        – Tgr
                        5 hours ago












                      up vote
                      2
                      down vote










                      up vote
                      2
                      down vote






















                      No, ReDoS does not require the attacker to craft unnatural search results.



                      The basic idea of ReDoS is that you have a sub-expression that can match in multiple ways and matches almost everywhere in the searched string except at the end, and you iterate that sub-expression to get catastrophic backtracking. For example, if your shop description is "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.", you can just use something like ([^q]|[^q][^q])+ (or more complex constructs with e.g. lookaheads).
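As a rough illustration of why this hurts, the sketch below counts how many ways the group can carve up the part of the description before the first "q" into one- and two-character pieces. I have added ^ and $ anchors to the pattern (my assumption, not part of the answer) so that the overall match has to fail at the "q" and the engine is forced to go through those alternatives. The helper name ways_to_split is made up for this example, and the count is a model of the search space, not a measurement of any particular engine.

```python
import re

# The alternation from the answer, with anchors added (an assumption) so that
# the overall match can fail and force a backtracking engine to retry.
PATTERN = re.compile(r"^([^q]|[^q][^q])+$")

DESCRIPTION = ("Lorem ipsum dolor sit amet, consectetur adipiscing elit, "
               "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.")


def ways_to_split(n: int) -> int:
    """Number of ways to cover n characters with one- and two-character pieces."""
    one_back, two_back = 1, 1  # ways(1) = 1, and ways(0) = 1 by convention
    for _ in range(n - 1):
        one_back, two_back = one_back + two_back, one_back
    return one_back


prefix_len = DESCRIPTION.index("q")  # characters before the first 'q'
print(prefix_len, ways_to_split(prefix_len))

# Sanity checks on short inputs only; do not run PATTERN on DESCRIPTION itself.
print(bool(PATTERN.match("abc")), bool(PATTERN.match("abcq")))
```

The count follows the Fibonacci sequence (1, 2, 3, 5, 8, ...), so an ordinary-looking description of about a hundred characters already gives an astronomically large number of combinations.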



                      Whether that's a problem depends on the setup: as other answers have explained, you can simply limit the time available to the regex engine.
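One way to impose such a limit, sketched here under the assumption that the search runs in Python (whose built-in re module has no timeout argument), is to evaluate the user's pattern in a child process and kill it once a time budget is exceeded. The function name search_with_timeout and the JSON hand-off are choices made for this example only.

```python
import json
import subprocess
import sys

# Child script: reads {"pattern": ..., "text": ...} as JSON on stdin and
# prints true/false depending on whether the pattern is found in the text.
_CHILD = (
    "import json, re, sys\n"
    "job = json.load(sys.stdin)\n"
    "print(json.dumps(re.search(job['pattern'], job['text']) is not None))\n"
)


def search_with_timeout(pattern: str, text: str, seconds: float = 2.0):
    """Return True/False for a match, or None if the time budget ran out."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD],
            input=json.dumps({"pattern": pattern, "text": text}),
            capture_output=True,
            text=True,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return None
    if done.returncode != 0:
        # Most likely an invalid pattern; surface it instead of guessing.
        raise ValueError("regex could not be evaluated: " + done.stderr.strip())
    return json.loads(done.stdout)


if __name__ == "__main__":
    print(search_with_timeout("shoe", "The Shoe Box sells shoes"))
    print(search_with_timeout(r"^(a+)+$", "a" * 60 + "b", seconds=1.0))
```

Starting a process per search is expensive, so this only suits low-traffic uses such as a single kiosk; the point is just that a runaway match gets cut off instead of hanging the application.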


















                      answered 10 hours ago









                      Tgr












                      • I would mention that there are regexp implementations that do not backtrack, and those avoid this problem.
                        – Taemyr
                        9 hours ago










                      • RE2 is already mentioned in another answer. It's not merely another implementation, though: it supports only a safe subset of the language, so you'd lose features compared to something like PCRE (arguably features that no one cares about in a product search form).
                        – Tgr
                        5 hours ago
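To show what "does not backtrack" means in practice, here is a hand-built sketch that matches the anchored pattern ^([^q]|[^q][^q])+$ (the anchors are my assumption) by tracking the set of states the match could be in after each character. This is the general idea behind automaton-based engines, written out for this one pattern only; it is not RE2's actual code, and the state names are invented for the example.

```python
import re


def matches_without_backtracking(text: str) -> bool:
    """Match ^([^q]|[^q][^q])+$ by tracking every possible state at once.

    START: nothing consumed yet.
    DONE:  at least one full group iteration consumed (accepting).
    MID:   first character of the two-character alternative consumed.
    """
    states = {"START"}
    for ch in text:
        next_states = set()
        if ch != "q":
            for state in states:
                if state in ("START", "DONE"):
                    # Either take the one-character branch (DONE) or begin
                    # the two-character branch (MID).
                    next_states.update(("DONE", "MID"))
                else:
                    # MID: the second character completes the two-character branch.
                    next_states.add("DONE")
        states = next_states
        if not states:
            return False  # no way to continue, e.g. a 'q' was seen
    return "DONE" in states


description = ("Lorem ipsum dolor sit amet, consectetur adipiscing elit, "
               "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.")
print(matches_without_backtracking("abc"))
print(matches_without_backtracking(description))

# Cross-check against the re module on a short input only.
print(re.fullmatch(r"([^q]|[^q][^q])+", "abc") is not None)
```

Each character is processed once against at most three states, so the running time grows linearly with the input length no matter how ambiguous the pattern is.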




























                       
