0
Please help. I can't figure out why the loop is breaking early.
Link: https://sololearn.com/compiler-playground/cZSD0FRqeMkp/?ref=app
5 answers
+ 2
Are you sure it's breaking early?
You have a print inside your while loop at line 23. You might have mistaken its output for part of the result. Comment it out or remove it, then run your code again.
23 # print(snt)
Your code is unnecessarily complicated, btw.
Maybe try something like this:
import re
tags = re.compile(r'<.*?>')
snt = "<abc>bc this is </a> and theat is <p> hi </p>"
# normalize spacing after removing tags
new_snt = ' '.join(tags.sub(' ', snt).split())
print(new_snt)
But using regex is not the recommended way to parse HTML.
Beautiful Soup, SGMLparser.regex or lxml.html are just some alternatives.
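For comparison, here is a minimal sketch that uses a real parser instead of a regex. It swaps in the standard library's html.parser module rather than any of the libraries named above, so it runs without installing anything. The names TextExtractor and strip_tags are made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves.

    TextExtractor is a hypothetical helper name, not part of any library.
    """

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # The parser calls this for every run of text outside a tag.
        self.chunks.append(data)


def strip_tags(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Normalize spacing the same way as the regex version.
    return " ".join(" ".join(parser.chunks).split())


snt = "<abc>bc this is </a> and theat is <p> hi </p>"
print(strip_tags(snt))
```

On this sample string it should give the same text as the regex version. The parser is tolerant of mismatched tags such as `<abc>` ... `</a>`, since this sketch only listens for text and never checks tag names.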
+ 1
Your loop might be breaking early because of:
1. Wrong loop condition
2. A break statement runs too soon
3. An error (exception) inside the loop
Tip: Add print() statements inside the loop to see what’s happening. If you share your code, I can help fix it directly.
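As a rough illustration of that tip, the sketch below prints the loop state on every pass and records why the loop stopped. It is not the code from the question. The function trace_loop and its max_steps parameter are hypothetical, and the loop simply removes one `<...>` tag per iteration.

```python
def trace_loop(text, max_steps=20):
    """Remove one <...> tag per iteration and report why the loop stopped."""
    steps = 0
    reason = "step limit reached"
    while steps < max_steps:
        start = text.find("<")
        end = text.find(">", start + 1) if start != -1 else -1
        # Debug print: shows the values the loop condition depends on.
        print(f"step={steps} start={start} end={end} text={text!r}")
        if start == -1 or end == -1:
            reason = "no complete tag left"
            break
        # Replace the tag with a space so neighbouring words stay apart.
        text = text[:start] + " " + text[end + 1:]
        steps += 1
    print(f"loop stopped after {steps} steps: {reason}")
    return text, steps, reason


result, steps, reason = trace_loop("<p> hi </p>")
```

Reading the trace makes it easy to tell a debug print apart from the real result, and to see whether the loop ended on its condition or on the break.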
+ 1
UnTentetive,
It would be very helpful if you could describe what your code should achieve, if possible with input values and the expected output.
+ 1
Lothar It prints all the text inside the HTML tags. It searches for the end of the starting tag and the start of the end tag, then extracts the words in between these two tags. After extraction, it removes the tags that were found and the words in between. There is also a condition that checks whether some words are not wrapped in a tag and adds those words to the collection.
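One possible reading of that description, as a sketch only and not the code from the playground link: it assumes flat, non-nested tag pairs and does not check that the opening and closing tag names match. The function name extract_words is made up for this example.

```python
def extract_words(snt):
    """Collect words from a string with flat (non-nested) tag pairs."""
    collected = []
    while True:
        tag_start = snt.find("<")
        if tag_start == -1:
            break
        # Words in front of the next tag are not wrapped; keep them too.
        collected.extend(snt[:tag_start].split())
        # End of the starting tag.
        open_end = snt.find(">", tag_start)
        # Start and end of the end tag.
        close_start = snt.find("</", open_end + 1) if open_end != -1 else -1
        close_end = snt.find(">", close_start) if close_start != -1 else -1
        if close_end == -1:
            # No complete tag pair left: keep the remainder as raw text.
            snt = snt[tag_start:]
            break
        # Words between the two tags.
        collected.extend(snt[open_end + 1:close_start].split())
        # Remove the processed tags and the words in between.
        snt = snt[close_end + 1:]
    # Whatever is left after the last tag pair is unwrapped text.
    collected.extend(snt.split())
    return collected


print(extract_words("<abc>bc this is </a> and theat is <p> hi </p>"))
```

The string shrinks on every pass, so the loop always terminates. Nested tags would need a real parser.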
0
It's just testing and learning code.