Extract address from webpage by python | Sololearn: Learn to code for FREE!
0

Extract address from webpage by python

I need to complete code that extracts addresses from an HTML web page. By "address" I mean the landmark, city, PIN code, and mobile/telephone number.

5th May 2021, 1:37 PM
CHINMAYA JENA
11 Answers
+ 1
Use bs4 to scrape the page and regex to refine the results.
5th May 2021, 1:40 PM
Slick
+ 1
bs4 is Beautiful Soup; version 4 is the most up-to-date version I know of. regex (the re module) is part of Python's standard library, and there is a regex tutorial here on SL. requests and bs4 are third-party packages that you install with pip, and a quick search will turn up help for both.

from bs4 import BeautifulSoup
import requests

# basic request: replace <url> with the address of the page
url = "<url>"
html = requests.get(url).text

# basic soup object
html_soup = BeautifulSoup(html, "html.parser")
print(html_soup)

# try it out, man
5th May 2021, 1:48 PM
Slick
0
I only need to use bs4 and beautifulsoup
5th May 2021, 1:41 PM
CHINMAYA JENA
0
Can you help me with this, sir?
5th May 2021, 1:41 PM
CHINMAYA JENA
0
My teacher said to import only bs4 and pathlib.
5th May 2021, 1:50 PM
CHINMAYA JENA
0
Can I have your contact number or email ID?
5th May 2021, 1:50 PM
CHINMAYA JENA
0
I will send you the code.
5th May 2021, 1:50 PM
CHINMAYA JENA
0
I couldn't help out with pathlib; I'd be learning right alongside you, haha. If I were you, I'd take the opportunity to get acquainted with pathlib first (I assume you'll be using it in place of requests). Once you've learned the basics of pathlib and can extract the HTML, just pass it to a BeautifulSoup object as shown above.
5th May 2021, 1:53 PM
Slick
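For anyone following along, the hand-off described above can be sketched roughly as follows. This assumes the page has already been saved to disk as an HTML file. The function names load_soup and visible_text are made up for illustration and are not from the thread.

```python
from pathlib import Path

from bs4 import BeautifulSoup


def load_soup(html_path):
    """Read a saved HTML file with pathlib and parse it with BeautifulSoup."""
    # Assumption: the file is UTF-8 encoded; change the encoding if yours differs.
    html = Path(html_path).read_text(encoding="utf-8")
    return BeautifulSoup(html, "html.parser")


def visible_text(soup):
    """Return the page text, one stripped string per line."""
    return soup.get_text(separator="\n", strip=True)
```

Calling load_soup on a saved page returns the same kind of BeautifulSoup object as the requests version, so everything after that step stays the same.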
0
What code would you write to extract only the addresses?
5th May 2021, 1:55 PM
CHINMAYA JENA
0
I mean the approach to extract them. By "address" I mean the landmark, city, state, PIN, and phone number.
5th May 2021, 1:56 PM
CHINMAYA JENA
0
I don't know; I never learned pathlib. But the approach can go several ways. When bs4 is fed HTML, it can parse it and extract specific data through tags and other HTML identifiers. That's how I'd get the addresses. Phone numbers are simple: write a phone number regex and extract every phone number from the text. If you search through my code bits you might find my phone number extractor. It scrapes any site and saves the results to a file, though I can't remember how up to date the version on here is, so it may just print them to the screen.
5th May 2021, 2:00 PM
Slick
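The approach in the last answer (tags for the address blocks, regex for the numbers) might look something like the sketch below. Everything specific in it is an assumption, not something from the thread: that addresses sit in an address tag or an element with class "address", that PIN codes are 6 digits, and that phone numbers are 10-digit Indian mobile numbers with an optional +91 or 0 prefix. The name extract_contacts is hypothetical. The re module ships with Python, so nothing extra needs to be installed for it.

```python
import re

from bs4 import BeautifulSoup

# Assumed formats (adjust for the real page):
# a 6-digit PIN code that does not start with 0
PIN_RE = re.compile(r"\b[1-9]\d{5}\b")
# a 10-digit mobile number starting with 6-9, with an optional +91 or 0 prefix
PHONE_RE = re.compile(r"(?<!\d)(?:\+91[\s-]?|0)?[6-9]\d{9}(?!\d)")


def extract_contacts(html):
    """Return one dict per address block with its text, PIN codes and phones."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # Assumption: each address is in an <address> tag or has class "address".
    for block in soup.select("address, .address"):
        text = block.get_text(separator=" ", strip=True)
        results.append(
            {
                "text": text,
                "pins": PIN_RE.findall(text),
                "phones": PHONE_RE.findall(text),
            }
        )
    return results
```

On a real page the selector is the part most likely to need changing, so inspect the HTML first to see which tags or classes actually wrap the addresses. Phone numbers written with spaces or dashes between digits would need a looser pattern than the one assumed here.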