Posts in the 'Python' category | The Man Who Records Source Code

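from bs4 import BeautifulSoup
import urllib.request

url = 'http://~~'  # placeholder; replace with the page you want to crawl
# Download the raw HTML and parse it
source_code = urllib.request.urlopen(url).read()
soup = BeautifulSoup(source_code, "html.parser")
# Print the target of every <a> tag (None when the tag has no href attribute)
for link in soup.find_all('a'):
    print(link.get('href'))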

This code collects all the links on a page by looking up every tag named 'a'.
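
As a rough sketch of the same idea that runs without a network connection, the snippet below parses a hard-coded HTML string instead of downloading a page. The names extract_links and sample_html and the example.com URLs are made up for illustration, and resolving relative links with urljoin is an addition that the original code does not do.

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute URLs for every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a"):
        href = anchor.get("href")  # None when the tag has no href
        if href is None:
            continue  # skip anchors such as <a name="top">
        links.append(urljoin(base_url, href))  # make relative links absolute
    return links


# Hypothetical page content, used only for illustration
sample_html = (
    '<a href="/docs">Docs</a>'
    '<a name="top">Top</a>'
    '<a href="http://example.org/x">X</a>'
)
links = extract_links(sample_html, "http://example.com/index.html")
print(links)
```

Skipping tags without an href avoids printing None for anchors that are only used as named targets.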

Before you start basic crawling, you need to know a few methods, such as soup.find() and soup.find_all().

The find() method is usually called with one or two arguments: the tag name and, optionally, a set of attributes to match.
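
A minimal sketch of the one-argument and two-argument forms, assuming a small made-up HTML string (the class names "menu" and "body" are hypothetical and not from the original post):

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, used only for illustration
html = """
<div class="menu"><a href="/home">Home</a><a href="/about">About</a></div>
<div class="body"><a>No link here</a></div>
"""
soup = BeautifulSoup(html, "html.parser")

first_link = soup.find("a")                     # one argument: the tag name
menu_div = soup.find("div", {"class": "menu"})  # two arguments: tag name + attributes
missing = soup.find("table")                    # nothing matches, so this is None
all_links = soup.find_all("a")                  # every match, in document order

hrefs = [a.get("href") for a in all_links]
print(first_link, menu_div.get("class"), missing, hrefs)
```

Because find() can return None, it is worth checking the result before calling methods on it.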

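soup.find() returns a single tag object: the first match, or None if nothing matches. soup.find_all() returns a list of tag objects; for example, the find_all() call in the sample code at the top of this page returns a list of all the 'a' tags in the page at that URL.

You can then get the text of a tag with the .getText() method.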
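
A small sketch of pulling text and attributes out of a tag, assuming a made-up HTML string. It uses get_text(), the name used in the Beautiful Soup documentation; getText() in the examples on this page is the same method under an older alias.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, used only for illustration
html = '<p>Hello, <b>world</b>!</p><a href="/next">Next page</a>'
soup = BeautifulSoup(html, "html.parser")

# get_text() joins the text of the tag and of everything nested inside it
paragraph_text = soup.find("p").get_text()

link = soup.find("a")
link_text = link.get_text()     # the visible text of the link
link_target = link.get("href")  # an attribute value: a plain string, not a tag

print(paragraph_text, link_text, link_target)
```

Note that link.get("href") returns an ordinary string, so it has no getText() method of its own.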

Because soup.find_all() returns a list-like object (a ResultSet), you can use it as an iterable in a loop statement. The single tag returned by soup.find() can be used directly, without a loop.

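from bs4 import BeautifulSoup
import urllib.request

url = 'http://'  # placeholder; replace with the page you want to crawl
source_code = urllib.request.urlopen(url).read()
soup = BeautifulSoup(source_code, "html.parser")

# find_all() returns a list-like ResultSet, so it can be looped over directly
divs = soup.find_all('div')
for div in divs:
    print(div.getText())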
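
One thing to watch for with a loop like this is nested tags: find_all('div') also returns divs that sit inside other divs, so the inner text can show up more than once. The sketch below illustrates this with a made-up HTML string (the ids are hypothetical) and shows recursive=False as one possible way to look only at direct children.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, used only for illustration
html = (
    "<div id='outer'>Outer <div id='inner'>Inner</div></div>"
    "<div id='other'>Other</div>"
)
soup = BeautifulSoup(html, "html.parser")

divs = soup.find_all("div")  # includes the nested div as its own entry
texts = [div.get_text(" ", strip=True) for div in divs]

# recursive=False restricts the search to direct children of the soup object
top_level_ids = [div.get("id") for div in soup.find_all("div", recursive=False)]

print(len(divs), texts, top_level_ids)
```

Here "Inner" appears both in the outer div's text and as its own entry, which is why the top-level search returns fewer results.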

How to crawl a webPage by python (BeautifulSoup)