Using Regex To Pass Syntax-valid C++ Declaration/initialization
Solution 1:
No regex in the world will be powerful enough to parse C++ declarations, for the very simple reason that the grammar is severely context-sensitive (and, in all likelihood, is actually undecidable).
For example, using the IsPrime template defined here, you can write a declaration like
int a = foo<IsPrime<234799>>::typen<1>();
which is syntactically valid if and only if 234799 is prime.
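The linked IsPrime definition is not reproduced in this answer, so the following is only a hypothetical reconstruction of how such a declaration can depend on primality. The names has_no_divisor_from, foo and typen, and the small prime 7 (chosen to keep compile-time cost trivial), are assumptions made for illustration. It assumes C++11 or later.

```cpp
#include <type_traits>

// Hypothetical trial-division helper (C++11 single-return constexpr).
constexpr bool has_no_divisor_from(unsigned n, unsigned d) {
    return d * d > n ? true
         : n % d == 0 ? false
         : has_no_divisor_from(n, d + 1);
}

// Hypothetical stand-in for the IsPrime template the answer links to.
template <unsigned N>
struct IsPrime
    : std::integral_constant<bool, (N >= 2) && has_no_divisor_from(N, 2)> {};

template <typename T, bool = T::value>
struct foo;

// Prime case: typen is a function template, so typen<1>() is a call.
template <typename T>
struct foo<T, true> {
    template <int> static int typen() { return 1; }
};

// Composite case: typen is a plain constant, so typen<1>() would be read
// as (typen < 1) > (), which is not a valid expression.
template <typename T>
struct foo<T, false> {
    static const int typen = 0;
};

int a = foo<IsPrime<7>>::typen<1>();
```

Replacing 7 with a composite number such as 8 selects the second specialization, and the same line then fails to parse. The parser has to evaluate the template before it can tell which reading of < and > applies.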
Consider using a different approach to validate C++ (e.g. g++ -fsyntax-only).
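As a rough sketch of that suggestion, a program could write the candidate declaration to a scratch file and let the compiler front end judge it. The helper names and the decl_check.cpp scratch path below are hypothetical, and the sketch assumes g++ is on the PATH and that std::system runs a POSIX shell (for the 2>/dev/null redirection).

```cpp
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <string>

// Builds the shell command; the redirection assumes a POSIX shell.
std::string syntax_check_command(const std::string& path) {
    return "g++ -fsyntax-only \"" + path + "\" 2>/dev/null";
}

// Hypothetical helper: returns true when g++ accepts the snippet.
// Assumes g++ is installed and the working directory is writable.
bool is_accepted_by_compiler(const std::string& code,
                             const std::string& path = "decl_check.cpp") {
    {
        std::ofstream out(path);
        if (!out) return false;
        out << code << '\n';
    }  // the file is closed here, before the compiler reads it
    const int status = std::system(syntax_check_command(path).c_str());
    std::remove(path.c_str());  // clean up the scratch file
    return status == 0;
}
```

A call such as is_accepted_by_compiler("int a = 5, b;") would then defer the decision to the real C++ grammar instead of approximating it with a pattern.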
Solution 2:
As nneonneo mentioned, regex is not suitable for the task, but if you want to match the sample strings you have, you can use this:
^(?:\s*[A-Za-z_][A-Za-z0-9]*\s*(?:=\s*(?:[A-Za-z0-9]+(?:[+\/*-][A-Za-z0-9]+)?|"[^"]*"|'[^']*'))?\s*,)*\s*[A-Za-z_][A-Za-z0-9]*\s*(?:=\s*(?:[A-Za-z0-9]+(?:[+\/*-][A-Za-z0-9]+)?|"[^"]*"|'[^']*'))?\s*;
Couple of things I changed from your regex:
Changed [A-z] to [A-Za-z].
Put the =\s* 'outside' because it was quite repetitive.
Added square brackets to the bare 0-9. I believe it was meant to be a character class.
Added letters to the character class [0-9].
Changed all the [^] to [^"] and [^'] where appropriate. I'm not too sure what you were trying, but just in case.
Added the basic integer operators and digits (and letters for variables) following it: (?:[+/*-][A-Za-z0-9]+)?.
Changed the * in the first character class after = to + to prevent an immediate , after =.
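The [A-z] change matters because, in ASCII, the range from A to z also covers the six characters that sit between Z and a: [ \ ] ^ _ and the backtick. A minimal check with std::regex (the helper name matches_one is made up for this sketch) could look like this:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: does the character class accept this single character?
bool matches_one(const std::string& char_class, char c) {
    return std::regex_match(std::string(1, c), std::regex(char_class));
}
```

With this helper, matches_one("[A-z]", '^') is true while matches_one("[A-Za-z]", '^') is false, which is why the narrower class is the safer choice for identifiers.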
EDIT:
^(?:\s*[A-Za-z_][A-Za-z0-9_]*\s*(?:=\s*(?:[A-Za-z0-9_]+(?:\s*[+\/*-]\s*[A-Za-z0-9_]+)*|[0-9]+(?:\.[0-9]+)?(?:\s*[+\/*-]\s*[0-9]+(?:\.[0-9]+)?)+|"[^"]*"|'[^']*'))?\s*,)*\s*[A-Za-z_][A-Za-z0-9_]*\s*(?:=\s*(?:[A-Za-z0-9_]+(?:\s*[+\/*-]\s*[A-Za-z0-9_]+)*|[0-9]+(?:\.[0-9]+)?(?:\s*[+\/*-]\s*[0-9]+(?:\.[0-9]+)?)+|"[^"]*"|'[^']*'))?\s*;$
This version allows more whitespace around operators and allows underscores in variable names.
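To try the edited pattern, here is a small sketch using std::regex, whose default grammar is ECMAScript. It assembles the same expression from its repeated parts instead of pasting it twice. The helper names and the test inputs are hypothetical, since the original sample strings are not shown here.

```cpp
#include <regex>
#include <string>

// Rebuilds the edited pattern from its two repeated pieces.
std::regex build_decl_list_regex() {
    const std::string id = R"re([A-Za-z_][A-Za-z0-9_]*)re";
    const std::string value =
        R"re((?:[A-Za-z0-9_]+(?:\s*[+\/*-]\s*[A-Za-z0-9_]+)*)re"
        R"re(|[0-9]+(?:\.[0-9]+)?(?:\s*[+\/*-]\s*[0-9]+(?:\.[0-9]+)?)+)re"
        R"re(|"[^"]*"|'[^']*'))re";
    // One declarator: a name with an optional "= value" part.
    const std::string item =
        R"re(\s*)re" + id + R"re(\s*(?:=\s*)re" + value + R"re()?\s*)re";
    // Zero or more "item," groups, then a final item and a semicolon.
    return std::regex("^(?:" + item + ",)*" + item + ";$");
}

// Hypothetical helper: true when the whole line is accepted by the pattern.
bool matches_decl_list(const std::string& line) {
    static const std::regex re = build_decl_list_regex();
    return std::regex_match(line, re);
}
```

Under these assumptions, inputs such as "a = 5, b;" and "x = 1.5 + 2.25;" are accepted, while "a = , b;" is rejected. Two limits are worth knowing: the pattern covers only the declarator list, so a leading type as in "int a = 5;" is rejected, and a lone decimal literal such as "x = 1.5;" is also rejected because the decimal alternative requires at least one operator.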