나를 위한 코드

Speeding up BeautifulSoup

by 라나나 2023. 3. 14.

lxml is faster than html.parser.
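
A minimal sketch of switching parsers, assuming `bs4` is installed. `pick_parser` and the sample `HTML` string are made up for illustration; the helper falls back to html.parser so the snippet still runs when lxml is missing.

```python
from bs4 import BeautifulSoup


def pick_parser():
    """Hypothetical helper: return "lxml" if it is installed, else "html.parser"."""
    try:
        import lxml  # noqa: F401  (imported only to check availability)
        return "lxml"
    except ImportError:
        return "html.parser"


# Made-up sample document, not taken from any real page.
HTML = (
    "<html><body><div id='main'>"
    "<p class='item'>one</p><p class='item'>two</p>"
    "</div></body></html>"
)

# The parser name is the second argument to the BeautifulSoup constructor.
soup = BeautifulSoup(HTML, pick_parser())
first_item = soup.find("p", class_="item").get_text()
```

How much faster lxml is depends on the page, so it is worth timing both parsers on your own HTML.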


I asked ChatGPT how to speed up BeautifulSoup.

Its answer:

  1. Use lxml: The lxml parser is faster than the built-in html.parser. Install it with pip: pip install lxml. To use it, pass 'lxml' as the second argument to the BeautifulSoup constructor.
  2. Use find instead of select: For simple lookups by tag name or attribute, find is usually faster than select, because select has to parse and evaluate a CSS selector. find returns only the first match; use find_all when you need every match.
  3. Limit the search scope: If you're looking for a specific element, find its parent element first and search only inside it. This way, BeautifulSoup doesn't have to walk the entire HTML tree.
  4. Pick the fastest parser available: BeautifulSoup supports multiple parsers, including lxml, html5lib, and html.parser. lxml is generally the fastest and html5lib the slowest, so switch parsers only after checking which one is actually faster on your pages.
  5. Avoid regex filters: A regex filter runs against every candidate element, which is usually slower than matching an exact string, so avoid regex with BeautifulSoup where a plain string will do.
  6. Use a cache: If you process the same page more than once, cache the BeautifulSoup object so you don't have to parse the page again.
  7. Parallelize with concurrent.futures: If you're processing multiple pages, you can use the concurrent.futures module to parse them in parallel, for example across multiple processes.
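
Tips 2, 3, 6 and 7 can be sketched together roughly like this. Every name here (`PAGES`, `make_soup`, `cached_soup`, `content_links`, `page_title`, `titles_in_parallel`) is hypothetical, and the HTML is made-up sample data. The sketch uses threads by default so it runs anywhere; since the tip is about multiple processes, you can pass `ProcessPoolExecutor` as `executor_cls` instead, in a script guarded by `if __name__ == "__main__":`.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

from bs4 import BeautifulSoup

# Made-up sample pages standing in for downloaded HTML.
PAGES = [
    "<html><head><title>Page A</title></head><body>"
    "<div id='content'><a href='/a1'>a1</a><a href='/a2'>a2</a></div>"
    "<div id='footer'><a href='/home'>home</a></div></body></html>",
    "<html><head><title>Page B</title></head><body>"
    "<div id='content'><a href='/b1'>b1</a></div></body></html>",
]


def make_soup(html):
    # Prefer lxml when it is installed; otherwise use the built-in parser.
    try:
        import lxml  # noqa: F401
        return BeautifulSoup(html, "lxml")
    except ImportError:
        return BeautifulSoup(html, "html.parser")


@lru_cache(maxsize=128)
def cached_soup(html):
    # Tip 6: an identical HTML string is parsed once, then reused.
    # The cached object is shared, so treat it as read-only.
    return make_soup(html)


def content_links(html):
    soup = cached_soup(html)
    # Tip 3: locate the parent once, then search only inside it.
    content = soup.find("div", id="content")
    if content is None:
        return []
    # Tip 2: a plain tag lookup, with no CSS selector to evaluate.
    return [a["href"] for a in content.find_all("a")]


def page_title(html):
    title = make_soup(html).find("title")
    return title.get_text() if title else None


def titles_in_parallel(pages, executor_cls=ThreadPoolExecutor):
    # Tip 7: each page is parsed by a separate worker; map keeps input order.
    with executor_cls(max_workers=2) as pool:
        return list(pool.map(page_title, pages))


links = content_links(PAGES[0])
titles = titles_in_parallel(PAGES)
```

Here `links` holds only the hrefs inside the `content` div, so the footer link is skipped. Whether processes beat threads depends on how heavy the parsing is compared with the cost of sending HTML to the workers, so measure before committing to one.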
