The Artima Developer Community
Sponsored Link

Weblogs Forum
Beautiful Soup 4 Now in Beta

0 replies on 1 page.

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 0 replies on 1 page
Bruce Eckel

Posts: 875
Nickname: beckel
Registered: Jun, 2003

Beautiful Soup 4 Now in Beta (View in Weblogs)
Posted: Feb 23, 2012 6:12 AM
Reply to this message Reply
If you do any HTML/XML processing, you've probably heard of this powerful and useful package. Download the latest beta and run it on your code to help debug it and make sure it does what you need!

Many improvements have been made in this new version -- for one thing, it's compatible with both Python 2 and Python 3. One of the biggest changes is that Beautiful Soup no longer uses its own parser; instead it chooses from what's available on your system, preferring the blazingly-fast lxml but falling back to other parsers and using Python's batteries-included parser if nothing else is available.

A useful tip if you're on Windows: you can find a pre-compiled Windows version of lxml here. That site has lots of pre-compiled Python extensions which is extremely helpful, as some of these packages (like lxml) otherwise require some serious gyrations in order to install them on your Windows box. (I work regularly on Mac, Windows 7 and Ubuntu Linux, in order to ensure that whatever I'm working on is cross-platform.)

Beautiful Soup has been refactored in many places; sometimes these changes constitute a significant improvement to the programming model, other times the changes are just to conform to the Python naming syntax or to ensure Python 2/3 compatibility.

The author Leonard Richardson is open to suggestions for improvements, so if you've had a feature request sitting on your back burner, now's the time.

Here's the introductory link to the Beautiful Soup 4 Beta.

I've been using Beautiful Soup to process a book that I'm coauthoring via Google Docs. We can work on the book remotely at the same time, which is something I've tried to do with other technologies via screen sharing. It works best with Google Docs because there's no setup necessary if we want to have a phone conversation about the document while working on it. Then I download the book in HTML format and apply the Beautiful Soup tools I've written to process the HTML. Although I've spent a fair amount of time on these, the investment is worth it because HTML isn't going away anytime soon so my Beautiful Soup skills should come in handy again and again.

Topic: A Test of Knowledge Previous Topic   Next Topic Topic: Review of My New System: System 76 Gazelle Pro Laptop & Ubuntu 11.10

Sponsored Links


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use