How can I get rid of apostrophes in a txt file?
I have a txt file and I'm trying to get a list that contains all the words in it. But I need to get rid of apostrophes, dots, commas, etc. I used the strip function for commas, dots and indentation, and I used replace for apostrophes, but it didn't work. I don't understand why, because it normally works on strings. Here is my code. I can also provide the txt file if you guys need it. https://code.sololearn.com/cN8sPyyJUU5z/?ref=app
11 Answers
+ 6
You can try this, and you can easily adapt it to more characters:
https://code.sololearn.com/c4Z8hOVAMO6j/?ref=app
+ 5
It would be very helpful if you could just copy some lines from your input text BEFORE it is processed. And maybe you could also give us a sample of how it should look.
+ 2
If I were given this task, I'd use regular expressions, i.e. "re.findall".
Use the pattern "[a-zA-Z0-9]+"
Much, much easier than the way you're trying to do it in your code.
Edit: your code does actually work anyway. What problem are you getting?
+ 2
Regex is the first choice. If not, you could write a function that takes the strings to be removed as parameters and calls .replace() for each one. Something like this:
https://code.sololearn.com/c0h95U1GcMWy/?ref=app
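The linked code is not reproduced here, so the following is only a minimal sketch of the idea described above: a function that receives the strings to remove as parameters and calls .replace() for each. The name remove_all and the replacement keyword are assumptions made for illustration, not the poster's actual code.

```python
def remove_all(text, *unwanted, replacement=""):
    """Return text with every string in `unwanted` swapped for `replacement`.

    Hypothetical helper: the name and the `replacement` keyword are
    illustrative assumptions, not taken from the linked code.
    """
    for item in unwanted:
        text = text.replace(item, replacement)
    return text


sample = "Hello, world. It's fine."

# Removing the characters outright glues "It's" into "Its".
print(remove_all(sample, ",", ".", "'").split())

# Replacing them with a space splits at the apostrophe instead.
print(remove_all(sample, ",", ".", "'", replacement=" ").split())
```

Replacing with a space rather than an empty string is probably closer to what the question asks for, since the desired list has "Rusya" and "da" as separate items.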
+ 2
Lothar this is a part of my text "Dostoyevski Rusya’da yaşanan siyasi ve ekonomik olaylar sonrasında gözlemlediği hayatlardan esinlenerek 1866 yılında yazdığı eser ilk olarak Rus Habercisi isimli edebiyat dergisinde yayınlanmıştır. Büyük beğeni toplayan eser daha sonra kitap haline getirilmiş ve o günden beri birçok kitap ve sinema filmine konu olmuştur. Suç ve Ceza Dostoyevski’nin başyapıtı sayılır."
I'm trying to get a list like this:
["Dostoyevski", "Rusya", "da", "yaşanan"............]
+ 2
import re

class File:
    def __init__(self):
        with open("file.txt", "r", encoding="utf-8") as file:
            self.naked_words = re.findall(r"\w+", file.read())  # < edited to \w+

myfile = File()
print(myfile.naked_words)
+ 2
Lothar Awesome. It literally worked. Thanks a lot.
+ 2
Baran Aldemir my code worked. It was not working in your case because you never told it to replace ’. Just pass whatever you want to replace as arguments in the function (strip_params) call (in line 10).
+ 2
Haha I'm so dumb. I've just realized my main problem was the difference between the (') and (’) symbols. I guess we don't have (’) on the keyboard? XXX That's why I couldn't make your code work.
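To make that difference concrete, here is a small sketch showing that the straight apostrophe and the curly one are different Unicode characters, so replacing one leaves the other untouched. The variable names are illustrative, and the translation table at the end is just one possible way to handle both at once.

```python
import unicodedata

straight = "'"      # what most keyboards type
curly = "\u2019"    # often inserted by word processors and web pages

# Print the code point and official Unicode name of each character.
for ch in (straight, curly):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

text = "Dostoyevski\u2019nin"

# Replacing the straight apostrophe does nothing: it is not in the text.
unchanged = text.replace("'", " ")

# Replacing the curly one splits the word as intended.
fixed = text.replace("\u2019", " ")

# One table that maps both kinds of apostrophe to a space.
table = str.maketrans({"'": " ", "\u2019": " "})
both = "Rusya'da Dostoyevski\u2019nin".translate(table)

print(unchanged, fixed.split(), both.split())
```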
+ 1
Thank you guys for your responses. rodwynnejones I'll look into regex. I don't know that subject yet. XXX I implemented your function in my file but somehow it doesn't work. The apostrophes are still there.
+ 1
rodwynnejones It worked to some extent. But there are some other consequences,
probably because of Turkish characters. I should probably read about regex first. I appreciate all the effort. Thanks again.
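One plausible explanation for those side effects, assuming the pattern "[a-zA-Z0-9]+" suggested earlier was used: that character class covers only ASCII letters and digits, so words break at letters such as ü or ğ. The sketch below compares it with \w+, which in Python 3 matches Unicode word characters for str patterns. The sample phrase is taken from the text quoted in this thread.

```python
import re

sample = "Büyük beğeni toplayan eser"

# ASCII-only class: every non-ASCII letter acts as a separator.
ascii_only = re.findall(r"[a-zA-Z0-9]+", sample)

# \w is Unicode-aware for str patterns in Python 3, so words stay whole.
# Note that \w also matches digits and the underscore.
unicode_aware = re.findall(r"\w+", sample)

print(ascii_only)
print(unicode_aware)
```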