I'm making my way through Toby Segaran's excellent new book "Programming Collective Intelligence," and I'm posting here some of the errata I've found in the code so far that haven't been reported or published on the O'Reilly site yet. I'll report them but also want to explain them here. (I can't get the Python code to indent using the code markup plugin. Please let me know if you have suggestions.)
Chapter 3, Discovering Groups
The main body of this file bombs on
because the URL http://www.techeblog.com/index.php/feed/ toward the bottom of
no longer returns an RSS feed. We could remove that URL from feedlist.txt, find the working RSS URL for techeblog, or make our code robust enough to deal with this kind of problem in general. To do the last of these, wrap the call to getwordcounts in Python's exception handling:
try:
    title,wc=getwordcounts(feedurl)
except AttributeError:
    continue
The variable feedlist in the line
is referenced but not initialized or computed before that.
The fix is to initialize feedlist and increment it as each feedurl is successfully processed:
feedlist = 0
for feedurl in file('feedlist.txt'):
    try:
        title,wc=getwordcounts(feedurl)
    except AttributeError:
        continue
    feedlist += 1
    wordcounts[title]=wc
    for word,count in wc.items():
        apcount.setdefault(word,0)
        if count>1:
            apcount[word]+=1
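To try the skip-and-count pattern without hitting the network, here is a minimal sketch in current Python. The helper name count_feeds, the stand-in fake_getwordcounts, and the example.com URLs are all hypothetical and only there for illustration; the book's getwordcounts fetches and parses a real feed.

```python
def count_feeds(feed_urls, getwordcounts):
    """Tally word counts per feed, skipping feeds that fail to parse.

    Returns (wordcounts, apcount, processed), where processed is the
    number of feeds that were actually parsed.
    """
    wordcounts = {}
    apcount = {}
    processed = 0
    for feedurl in feed_urls:
        try:
            title, wc = getwordcounts(feedurl)
        except AttributeError:
            # The feed had no usable title/entries; move on to the next URL.
            continue
        processed += 1
        wordcounts[title] = wc
        for word, count in wc.items():
            apcount.setdefault(word, 0)
            if count > 1:
                apcount[word] += 1
    return wordcounts, apcount, processed


def fake_getwordcounts(url):
    # Hypothetical stand-in for the book's getwordcounts: it pretends that
    # any URL containing 'broken' is a dead feed.
    if 'broken' in url:
        raise AttributeError('feed has no title')
    return url, {'python': 3, 'feed': 1}


urls = [
    'http://a.example.com/rss',
    'http://broken.example.com/rss',
    'http://b.example.com/rss',
]
wordcounts, apcount, processed = count_feeds(urls, fake_getwordcounts)
```

Note that processed is a plain integer, so anything that later needs the share of blogs containing a word would divide by it directly rather than take its length.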
Lastly for Chapter 3, the string handling chokes on a character from one of the feeds that doesn't bridge the ASCII and Unicode worlds. I googled for a solution and came up with this simple fix:
Replace
out = open('blogdata.txt','w')
out.write('Blog')
with
import codecs
out = codecs.open('blogdata.txt','wb','utf-8')
out.write('Blog')
I'm not up to speed on unicode so don't ask me how it works; it works.
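For what it's worth, the idea seems to be that a file opened with an explicit encoding turns Unicode text into bytes for you as it writes. Here is a minimal sketch of that, assuming current Python and using io.open (the same thing as the built-in open in Python 3) rather than codecs.open. The helper write_header and the blog names are made up for illustration.

```python
import io
import os
import tempfile


def write_header(path, blog_names):
    # Opening with encoding='utf-8' makes every write() encode the text
    # as UTF-8, so non-ASCII characters no longer raise an encoding error.
    # newline='\n' keeps line endings identical on every platform.
    with io.open(path, 'w', encoding='utf-8', newline='\n') as out:
        out.write(u'Blog')
        for name in blog_names:
            out.write(u'\t' + name)
        out.write(u'\n')


# Hypothetical blog names; one contains a non-ASCII character (e with acute).
path = os.path.join(tempfile.mkdtemp(), 'blogdata.txt')
write_header(path, [u'Caf\u00e9 Code', u'Plain Blog'])

# Read the raw bytes back to see what actually landed on disk.
with io.open(path, 'rb') as raw:
    data = raw.read()
```

The accented character ends up on disk as the two bytes 0xC3 0xA9, which is its UTF-8 encoding.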
That's it for Chapter 3. More later as I make my way through the book. Btw, I just checked Toby's blog and found that you can download the source code.
This entry was posted on Friday, December 21st, 2007 and is filed under books, python.