What are the best machine learning libraries for Python?

I recently found an oDesk job that matches my interests. The client says: “I would like that predictor to be written in Python only, and leverage only publicly-available libraries (mlpy, scipy,scikit etc.)”. It would be a good idea to try more than one package and compare the resulting F-scores, so I googled the best-known machine learning packages for Python. Here they are:

MLPY and PyML seem to be the best-known, mainstream choices. Of the packages in the list above, the Anaconda Python distribution seems to include only scikit-learn. On the other hand, if your task involves only NLP, the NLTK package may be enough.
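To compare packages by F-score, as mentioned above, a small library-independent helper is enough. The sketch below is only an illustration: the function name f_score, the labels library_a and library_b, and the toy predictions are all hypothetical, and in practice the predictions would come from whichever packages are being compared.

```python
def f_score(y_true, y_pred, positive=1, beta=1.0):
    """F-beta score for one positive class (beta=1.0 gives the usual F1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical toy data: true labels plus the output of two candidate packages.
y_true = [1, 0, 1, 1, 0, 1]
predictions = {
    "library_a": [1, 0, 1, 0, 0, 1],
    "library_b": [1, 1, 1, 1, 0, 0],
}

scores = {name: f_score(y_true, y_pred) for name, y_pred in predictions.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(name, round(score, 3))
print("best:", best)
```

Keeping the metric outside any one library means every package is scored by exactly the same code, so the numbers are directly comparable.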

Source: http://www.quora.com/What-are-the-best-open-source-machine-learning-libraries-written-in-Python

lxml – an alternative to BeautifulSoup

BeautifulSoup is a Python package that parses broken HTML, much as lxml does with the libxml2 parser. BeautifulSoup takes a different parsing approach: it is not a real HTML parser but uses regular expressions to dive through tag soup. It is therefore more forgiving in some cases and less good in others. It is not uncommon for lxml/libxml2 to parse and fix broken HTML better, but BeautifulSoup has superior support for encoding detection. Which parser works better depends very much on the input.

To prevent users from having to choose their parser library in advance, lxml can interface to the parsing capabilities of BeautifulSoup through the lxml.html.soupparser module. It provides three main functions: fromstring() and parse() to parse a string or file using BeautifulSoup into an lxml.html document, and convert_tree() to convert an existing BeautifulSoup tree into a list of top-level Elements.
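As a rough sketch of the fromstring() entry point described above: the snippet assumes both lxml and BeautifulSoup are installed, and the broken_html string is a made-up example. The exact shape of the repaired tree depends on the BeautifulSoup version and the parser it uses underneath, so no serialized output is shown.

```python
from lxml import etree
from lxml.html import soupparser

# Hypothetical tag soup: unclosed <p> and <b> elements, no <html> wrapper.
broken_html = "<p>First<p>Second <b>bold"

# BeautifulSoup does the parsing; the result is an ordinary lxml element tree.
root = soupparser.fromstring(broken_html)

print(root.tag)
print(etree.tostring(root, encoding="unicode"))

# From here on the usual lxml API applies.
text = "".join(root.itertext())
bold = root.find(".//b")
```

Because the result is a normal lxml tree, code written against lxml does not need to change when the parser behind it is swapped.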