The problem is the '^' and '$' in the regexp strings returned from the getters of the Grammars class. Take Grammars.startTag() as an example. The regex string is
`^<(@?${this.#identifier})( ${this.#attrs})*>$`
(because you're adding '^' and '$' before passing it to the RegExp constructor).
Now take the string "<h1>Hello</h1>". Its start tag matches the pattern itself, but the '^' and '$' metacharacters require "<h1>" to be both the start and the end of the whole string, which it is not in this case.
The same issue affects all the getters of the Grammars class.
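To see the effect of the anchors in isolation, here is a minimal sketch. The `identifier` and `attrs` patterns are hypothetical stand-ins for the private `#identifier` and `#attrs` fields, whose real definitions I can't see, so treat them as assumptions.

```javascript
// Hypothetical stand-ins for the private #identifier and #attrs patterns.
const identifier = "[A-Za-z][A-Za-z0-9]*";
const attrs = `${identifier}="[^"]*"`;

// The start-tag pattern without any anchors.
const body = `<(@?${identifier})( ${attrs})*>`;

const anchored = new RegExp(`^${body}$`); // what the getter currently builds
const unanchored = new RegExp(body);      // same pattern, no '^' or '$'

const input = "<h1>Hello</h1>";

console.log(anchored.test(input));      // false: "<h1>" is not the whole string
console.log(anchored.test("<h1>"));     // true: the tag is the whole string
console.log(unanchored.exec(input)[0]); // "<h1>"
```

If you only need the tag to sit at the beginning of the remaining input, keeping '^' and dropping just the '$' is another option.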
Also, the regexp returned from Grammars.emptyTag() does not account for the optional '@' character before the tag name.
Fixing these two things will make your current test cases pass, but you might also want to add "\s*" here and there in the regexp objects, since you can't predict how the user will use whitespace.
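A rough sketch of what the whitespace-tolerant versions could look like. The `identifier` and `attrs` patterns are again hypothetical, and I'm assuming an empty tag is written as `<name />`, which may not match your grammar.

```javascript
// Hypothetical patterns; the real #identifier and #attrs may differ.
const identifier = "[A-Za-z][A-Za-z0-9]*";
// Allow optional whitespace around '=' inside an attribute.
const attrs = `${identifier}\\s*=\\s*"[^"]*"`;

// \s+ before each attribute, \s* wherever whitespace is optional.
const startTag = new RegExp(`<\\s*(@?${identifier})(\\s+${attrs})*\\s*>`);
// Assumed empty-tag syntax "<name />", with the optional '@' included.
const emptyTag = new RegExp(`<\\s*(@?${identifier})(\\s+${attrs})*\\s*/>`);

console.log(startTag.test('<h1   class = "a"  >')); // true
console.log(emptyTag.exec("<@widget />")[1]);       // "@widget"
console.log(startTag.test("<@widget />"));          // false: it ends in "/>"
```

Note the doubled backslashes: inside a string passed to the RegExp constructor, "\\s" is needed to produce the regex token \s.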
Also, I understand that this might just be code for practising regex, but using regex for HTML parsing is generally not a good idea. It is very hard to make a regex as forgiving as the HTML engines used by browsers.
The usual way to parse HTML has two stages. First, an HTML tokenizer goes through the code character by character, decides what to do with each character, and splits the HTML into tokens. Then a parser goes through the tokens one by one and constructs the DOM tree. The W3C has a detailed spec on how to tokenize and parse HTML documents. See
https://www.w3.org/TR/2011/WD-html5-20110525/parsing.html
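To give a feel for the two stages, here is a heavily simplified sketch. It is not the algorithm from the spec: it has only two states, keeps everything between '<' and '>' as the tag name (attributes included), and ignores comments, entities and error recovery. The names `tokenize` and `parse` are my own.

```javascript
// Stage 1: walk the input character by character and emit tokens.
function tokenize(html) {
  const tokens = [];
  let state = "data"; // "data" = outside a tag, "tag" = between '<' and '>'
  let buffer = "";

  for (const ch of html) {
    if (state === "data") {
      if (ch === "<") {
        if (buffer) tokens.push({ type: "text", value: buffer });
        buffer = "";
        state = "tag";
      } else {
        buffer += ch;
      }
    } else if (ch === ">") {
      const isEnd = buffer.startsWith("/");
      tokens.push({
        type: isEnd ? "endTag" : "startTag",
        name: isEnd ? buffer.slice(1) : buffer,
      });
      buffer = "";
      state = "data";
    } else {
      buffer += ch;
    }
  }
  if (state === "data" && buffer) tokens.push({ type: "text", value: buffer });
  return tokens;
}

// Stage 2: walk the tokens and build a tree using a stack of open elements.
function parse(tokens) {
  const root = { name: "#root", children: [] };
  const stack = [root];
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === "text") {
      parent.children.push(token.value);
    } else if (token.type === "startTag") {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (stack.length > 1) {
      stack.pop(); // endTag: close the most recently opened element
    }
  }
  return root;
}

const tree = parse(tokenize("<p>Hi <b>there</b></p>"));
console.log(JSON.stringify(tree, null, 2));
```

A real tokenizer has many more states (attribute names, quoted values, comments, and so on), but the overall shape stays the same: one pass over characters, then one pass over tokens.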