
I found the documentation on urllib2 a bit unclear about how to get cookie handling working properly.

I was working on a Python script that needed to contact the OpenStreetMap web server, log in with my OSM credentials and interact with the website.

The first step is to set up a urllib2 opener instance that is configured to store cookies.

import cookielib, urllib, urllib2
# StringIO is used further down to wrap the fetched HTML for ElementTree
import StringIO
import xml.etree.cElementTree as ElementTree

cookies = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookies))

This creates an opener that can be used to retrieve URLs. Any cookies set in an HTTP response are stored in the cookies cookie jar. If I needed additional handlers (e.g. for special redirect handling) I would just pass them as extra parameters to the build_opener call, e.g. urllib2.build_opener(handler1, handler2, handler3).
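
As a rough sketch of passing more than one handler to build_opener, here is the same idea written with the Python 3 module names (urllib.request and http.cookiejar are the renamed urllib2 and cookielib). The LoggingRedirectHandler class and the make_opener helper are hypothetical names made up for illustration; they are not part of my original script.

```python
import http.cookiejar
import urllib.request


class LoggingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: remembers each redirect target, then
    falls back to the standard redirect behaviour."""

    def __init__(self):
        self.seen = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.seen.append(newurl)
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def make_opener():
    """Build an opener that stores cookies and uses the custom redirect handler."""
    cookies = http.cookiejar.CookieJar()
    redirects = LoggingRedirectHandler()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookies),
        redirects,
    )
    return opener, cookies, redirects


opener, cookies, redirects = make_opener()
```

Because LoggingRedirectHandler subclasses the default HTTPRedirectHandler, build_opener uses it in place of the default one rather than installing both.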

Next we need to contact OpenStreetMap to get a blank login form. The blank login screen has a hidden field ‘authenticity_token’ that needs to be passed back as part of the POST with my login credentials.

inptag = '{http://www.w3.org/1999/xhtml}input'
formtag = '{http://www.w3.org/1999/xhtml}form'
# login_url is a placeholder: set it to the address of the OSM login page.
# username and password are assumed to hold your OSM credentials.
request = urllib2.Request(login_url)
login_payload = {}
# fetch the blank login form
response_tokenfetch = opener.open(request)
html = response_tokenfetch.read()
htmlfile = StringIO.StringIO(html)
# parse the HTML elements in the form
# extract any input fields for later resubmission
# this will pick up the authenticity_token and anything else
xml_tree = ElementTree.parse(htmlfile)
for form in xml_tree.getiterator(formtag):
    for field in form.getiterator(inptag):
        if 'name' in field.attrib and 'value' in field.attrib:
            login_payload[field.attrib['name']] = field.attrib['value']
# add the credentials and the extra fields the login form expects
login_payload['username'] = username
login_payload['password'] = password
login_payload['remember_me'] = 'yes'
login_payload['cookie_test'] = 'true'
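
ElementTree can only parse the page when it is well-formed XHTML. As a hedged alternative (a different technique from the ElementTree parsing above, not what my script does), the standard-library html.parser module can collect the same form fields from ordinary HTML. This sketch uses Python 3; the FormFieldCollector class, the collect_form_fields helper, the sample markup and the token value are all invented for illustration.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collects name/value pairs from <input> tags that sit inside a <form>."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.in_form = True
        elif tag == "input" and self.in_form:
            attrs = dict(attrs)
            # keep only inputs that carry both a name and a value
            if attrs.get("name") is not None and attrs.get("value") is not None:
                self.fields[attrs["name"]] = attrs["value"]

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def collect_form_fields(html):
    parser = FormFieldCollector()
    parser.feed(html)
    parser.close()
    return parser.fields


# Made-up stand-in for a login page
SAMPLE = """
<form action="/login" method="post">
  <input type="hidden" name="authenticity_token" value="abc123">
  <input type="text" name="username" value="">
  <input type="submit" value="Login">
</form>
"""

login_payload = collect_form_fields(SAMPLE)
login_payload["username"] = "example-user"
login_payload["password"] = "example-password"
```

The hidden authenticity_token is picked up automatically, and the submit button is skipped because it has no name attribute.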

Next we submit the LOGIN request as a POST. Any session cookies returned as part of the blank form will be added to the second request.

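# The opener's HTTPCookieProcessor already attaches stored cookies to every
# request it opens, so this explicit call is redundant but harmless.
cookies.add_cookie_header(request)
# Supplying a data argument sends the request as a POST instead of a GET
response = opener.open(request, urllib.urlencode(login_payload))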
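
To see what the POST step amounts to without touching the network, here is a small offline sketch using the Python 3 names (urllib.parse.urlencode and urllib.request.Request stand in for urllib.urlencode and urllib2.Request). The URL and every field value are placeholders, and nothing is actually sent.

```python
import urllib.parse
import urllib.request

# Placeholder payload; real values come from the parsed login form
login_payload = {
    "authenticity_token": "abc123",
    "username": "example-user",
    "password": "example-password",
    "remember_me": "yes",
}

# In Python 3 the request body must be bytes, so encode the form string
body = urllib.parse.urlencode(login_payload).encode("ascii")
post_request = urllib.request.Request("http://www.example.com/login", data=body)

# A Request with a body reports POST; without one it reports GET
method = post_request.get_method()
decoded = urllib.parse.parse_qs(body.decode("ascii"))
```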

If our login was successful then the cookies jar contains the _osm_session and _osm_username cookies, which will be sent with subsequent requests.
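
One way to check for those cookies, and to see what add_cookie_header does, is sketched below with the Python 3 names (http.cookiejar replaces cookielib). Everything runs offline: FakeResponse is a hypothetical stand-in for a real HTTP response, and the example.com URLs and cookie values are invented.

```python
import email
import http.cookiejar
import urllib.request


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; the cookie jar only calls info()."""

    def __init__(self, header_lines):
        self._headers = email.message_from_string("\n".join(header_lines) + "\n\n")

    def info(self):
        return self._headers


cookies = http.cookiejar.CookieJar()
login_request = urllib.request.Request("http://www.example.com/login")
fake_response = FakeResponse([
    "Set-Cookie: _osm_session=0123456789abcdef; path=/",
    "Set-Cookie: _osm_username=example-user; path=/",
])

# This is the step HTTPCookieProcessor performs on every response
cookies.extract_cookies(fake_response, login_request)

# Treat the login as successful only if both cookies were set
names = {cookie.name for cookie in cookies}
logged_in = {"_osm_session", "_osm_username"} <= names

# This is the step HTTPCookieProcessor performs on every outgoing request
next_request = urllib.request.Request("http://www.example.com/user/example-user/inbox")
cookies.add_cookie_header(next_request)
cookie_header = next_request.get_header("Cookie")
```

With a real opener built around HTTPCookieProcessor both steps happen automatically, so iterating over the jar after the login POST is enough to tell whether the session cookies arrived.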

request2 = urllib2.Request('http://api06.dev.openstreetmap.org/user/stevens/inbox')
cookies.add_cookie_header(request2)
response2 = opener.open(request2)
html=response2.read()

You could then parse the HTML to extract a list of messages.
If you're using the formal OpenStreetMap API (i.e. calls under /api/0.6/…) then you should use OAuth for authentication instead of logging in through the website. Some OSM features such as messaging can only be accessed by pretending to be a web session and parsing/faking HTML.