Speeding up BeautifulSoup

lxml is faster than html.parser.
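Rather than take that on faith, it is easy to measure. The snippet below is a rough sketch, not a proper benchmark: SAMPLE_HTML and time_parsers are names made up for this post, the document is synthetic, and the numbers will vary by machine and by page. A parser that is not installed is simply skipped.

```python
import timeit

from bs4 import BeautifulSoup, FeatureNotFound

# Synthetic sample document (made up for this sketch, not a real page).
SAMPLE_HTML = "<html><body>" + "".join(
    f'<div class="item"><a href="/p/{i}">item {i}</a></div>' for i in range(500)
) + "</body></html>"


def time_parsers(html, parsers=("html.parser", "lxml"), repeat=3):
    """Return {parser_name: average seconds per parse} for each installed parser."""
    results = {}
    for name in parsers:
        try:
            # Raises FeatureNotFound when the requested parser is not installed.
            BeautifulSoup(html, name)
        except FeatureNotFound:
            continue
        total = timeit.timeit(lambda: BeautifulSoup(html, name), number=repeat)
        results[name] = total / repeat
    return results


timings = time_parsers(SAMPLE_HTML)
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.1f} ms per parse")
```

If lxml is missing, pip install lxml and run it again to see both lines.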

I asked ChatGPT how to make BeautifulSoup faster.

Its answer:

Use lxml: The lxml parser is faster than Python's built-in html.parser. Install it with pip install lxml, then pass 'lxml' as the second argument to the BeautifulSoup constructor.
Use find instead of select: find is often faster than select, because select has to evaluate a CSS selector and collects every match, while find returns only the first element that matches a tag name or attribute.
Limit the search scope: If you are looking for a specific element, find its parent element first and search only inside it. That way BeautifulSoup does not have to walk the entire HTML tree.
Pick the parser deliberately: BeautifulSoup supports lxml, html5lib, and html.parser. Of these, lxml is generally the fastest and html5lib the slowest, so switch to html5lib only when you need its more lenient, browser-like parsing.
Avoid regex where you can: Regex matching adds overhead for every element it is tested against, so prefer an exact string match on tag names and attribute values when that is enough.
Use a cache: If you process the same page more than once, cache the parsed BeautifulSoup object so the page does not have to be parsed again.
Parallelize with concurrent.futures: If you are processing many pages, use the concurrent.futures module to parse them in parallel across multiple processes.
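
Here is a small sketch that puts a few of these tips together: find inside a narrowed scope, a cache for parsed pages, and parallel parsing with concurrent.futures. Everything named here (PAGE, get_soup, first_product_href, page_title, titles_in_parallel) is made up for illustration, and the page is a toy document. It uses html.parser so it runs without extra installs; swap in 'lxml' if you have it.

```python
import concurrent.futures
from functools import lru_cache

from bs4 import BeautifulSoup

PARSER = "html.parser"  # swap in "lxml" if it is installed

# Toy page, made up for this sketch.
PAGE = """
<html><head><title>Shop</title></head><body>
  <div id="nav"><a href="/">home</a></div>
  <div id="products">
    <a class="product" href="/p/1">first</a>
    <a class="product" href="/p/2">second</a>
  </div>
</body></html>
"""


@lru_cache(maxsize=32)
def get_soup(html):
    """Parse each distinct HTML string once; repeat calls return the same soup.

    The cached object is shared, so treat it as read-only.
    """
    return BeautifulSoup(html, PARSER)


def first_product_href(html):
    soup = get_soup(html)
    # Narrow the scope: locate the parent, then search only inside it.
    products = soup.find("div", id="products")
    if products is None:
        return None
    # find() returns the first match; no CSS selector is evaluated.
    link = products.find("a", class_="product")
    return link["href"] if link else None


def page_title(html):
    """Worker for parallel parsing; returns a plain str so the result is cheap to send back."""
    soup = BeautifulSoup(html, PARSER)
    return soup.title.get_text(strip=True) if soup.title else None


def titles_in_parallel(pages,
                       executor_cls=concurrent.futures.ProcessPoolExecutor,
                       max_workers=None):
    """Parse every page in a worker pool and return the titles in input order."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(page_title, pages))


print(first_product_href(PAGE))  # → /p/1
```

When you call titles_in_parallel with the default process pool from a script, do it under if __name__ == "__main__": so the worker processes can start cleanly. Parsing is CPU-bound, which is why processes are the default here; passing executor_cls=concurrent.futures.ThreadPoolExecutor is an option when most of the time goes to downloading rather than parsing.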


라나