scrape.py

February 27, 2010

At PyCon, I saw a lightning talk about scrape.py, a lightweight Python library for parsing web pages and interacting with them programmatically. For example, finding page elements:

>>> from scrape import *
>>> s.go('http://zesty.ca/')
<Region 0:17780>
>>> d = s.doc
>>> t = d.first('title')
>>> t
<Region 247:258 title>
>>> t.tagname
'title'
>>> t.text
u'Ka-Ping Yee'

The presentation I saw focused on the use case of testing your website. This is definitely a pain point for me personally: I currently either grep the HTML with regexes or I parse the whole thing using ElementTree and use XPath. But there are still a couple of problems: 1. JS usually isn’t testable this way; 2. you often have to construct your HTML with an eye towards testability. For example, to test pagination, you might need to add a class or id that marks the pagination section, so that a test can find that section and check the page links inside it.
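To make the pagination example concrete, here is a minimal sketch of the ElementTree-plus-XPath approach, not scrape.py itself. The markup, the "pagination" class name, and the pagination_links helper are all made up for illustration, and it assumes the page is well-formed enough for ElementTree to parse, which real HTML often isn't.

```python
import xml.etree.ElementTree as ET

# Hypothetical page markup. The "pagination" class exists only to give
# the test something to hook onto.
PAGE = """
<html>
  <body>
    <div class="posts"><a href="/post/1">First post</a></div>
    <div class="pagination">
      <a href="/page/1">1</a>
      <a href="/page/2">2</a>
      <a href="/page/3">Next</a>
    </div>
  </body>
</html>
"""


def pagination_links(markup):
    """Return the href of every link inside the pagination section."""
    root = ET.fromstring(markup)
    # ElementTree supports a limited XPath subset, including
    # [@attr='value'] predicates.
    anchors = root.findall(".//div[@class='pagination']/a")
    return [a.get("href") for a in anchors]


links = pagination_links(PAGE)
print(links)
```

Without the class on the div, the only way to tell the pagination links from the post links would be to guess from the hrefs or the link text, which is exactly the kind of fragile test I’d like to avoid.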


OpenHatch, the open source involvement engine

February 27, 2010

One of the neat things I saw at PyCon was a project called OpenHatch, which (among other features) indexes bugs and makes them easy to find according to your skillset and experience level. For example, lots of projects tag bugs as "easy" or "beginner" to promote newbie involvement and ramp-up; OpenHatch collects those bugs so you can find a project you’re actually able to contribute to.

I think there’s still some work to be done: bugs like "you need a better logo" or "help, our docs are crap" don’t quite fit into this workflow. Still, good effort.


Historical Dwarf Pie

February 27, 2010

Seen via JWZ: a fascinating story about a dwarf.

http://travelogue.betacantrips.com/wp-content/uploads/2010/02/wpid-b9vfl4b63kyki4z0uqLvC684o1_500.jpg

This is one of those things where I’m struck by how weird the world we live in is.
