20.6. urllib2 — extensible library for opening URLs (Python v2.7.3 documentation)

class urllib2.Request(url[, data][, headers][, origin_req_host][, unverifiable])

This class is an abstraction of a URL request.
url should be a string containing a valid URL.
data may be a string specifying additional data to send to the server, or
None if no such data is needed. Currently HTTP requests are the only ones
that use data; the HTTP request will be a POST instead of a GET when the
data parameter is provided. data should be a buffer in the standard
application/x-www-form-urlencoded format. The
urllib.urlencode() function takes a mapping or sequence of 2-tuples and
returns a string in this format.
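A minimal sketch of how the data parameter switches a request from GET to POST. The URL and form fields are hypothetical, and nothing is sent over the network. The fallback import is an addition so that the sketch also runs on Python 3, where urllib2 became urllib.request.

```python
try:  # Python 2, the version this page documents
    import urllib2
    from urllib import urlencode
except ImportError:  # Python 3 moved these names
    import urllib.request as urllib2
    from urllib.parse import urlencode

# Hypothetical endpoint and form fields, for illustration only.
url = "http://www.example.com/cgi-bin/register"
form = [("name", "Ada"), ("language", "Python")]

# urlencode() accepts a mapping or a sequence of 2-tuples.
data = urlencode(form)
if not isinstance(data, bytes):
    data = data.encode("ascii")  # Python 3 expects a bytes request body

get_request = urllib2.Request(url)         # no data: GET
post_request = urllib2.Request(url, data)  # data given: POST

print(get_request.get_method())   # GET
print(post_request.get_method())  # POST
```

Passing either request to urllib2.urlopen() would perform the actual transfer.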
headers should be a dictionary, and will be treated as if add_header()
was called with each key and value as arguments. This is often used to “spoof”
the User-Agent header, which is used by a browser to identify itself –
some HTTP servers only allow requests coming from common browsers as opposed
to scripts. For example, Mozilla Firefox may identify itself as
"Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11", while
urllib2's default user agent string is "Python-urllib/2.6" (on Python 2.6).
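As a rough illustration of the headers argument, the sketch below sets a browser-like User-Agent at construction time and adds a second header afterwards. The URL is hypothetical and no request is sent. The fallback import is only there so the sketch also runs on Python 3.

```python
try:  # Python 2, the version this page documents
    import urllib2
except ImportError:  # Python 3 renamed the module
    import urllib.request as urllib2

firefox = "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11"

# Headers passed to the constructor are handled like add_header() calls.
request = urllib2.Request(
    "http://www.example.com/",  # hypothetical URL
    headers={"User-Agent": firefox},
)
request.add_header("Accept-Language", "en")

# Header names are stored capitalized ("User-agent"), so lookups
# have to use that spelling.
print(request.get_header("User-agent"))
print(request.has_header("Accept-language"))  # True
```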
The final two arguments are only of interest for correct handling of third-party
HTTP cookies:
origin_req_host should be the request-host of the origin transaction, as
defined by RFC 2965. It defaults to cookielib.request_host(self). This
is the host name or IP address of the original request that was initiated by the
user. For example, if the request is for an image in an HTML document, this
should be the request-host of the request for the page containing the image.
unverifiable should indicate whether the request is unverifiable, as defined
by RFC 2965. It defaults to False. An unverifiable request is one whose URL
the user did not have the option to approve. For example, if the request is for
an image in an HTML document, and the user had no option to approve the
automatic fetching of the image, this should be true.
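The image scenario above could be expressed roughly as follows. Both host names are hypothetical. The sketch reads the values back from the origin_req_host and unverifiable attributes that the constructor sets; in Python 2.7 the documented accessors are get_origin_req_host() and is_unverifiable().

```python
try:  # Python 2, the version this page documents
    import urllib2
except ImportError:  # Python 3 renamed the module
    import urllib.request as urllib2

# The page the user asked for: both arguments keep their defaults.
page = urllib2.Request("http://www.example.com/index.html")

# An image fetched automatically while rendering that page. The user
# never approved this URL, so the request is marked unverifiable, and
# its origin is the host of the page rather than of the image.
image = urllib2.Request(
    "http://ads.example.net/banner.png",
    origin_req_host="www.example.com",
    unverifiable=True,
)

print(page.origin_req_host)   # www.example.com
print(page.unverifiable)      # False
print(image.origin_req_host)  # www.example.com
print(image.unverifiable)     # True
```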

class urllib2.OpenerDirector

The OpenerDirector class opens URLs via BaseHandlers chained
together. It manages the chaining of handlers, and recovery from errors.
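A small sketch of the chaining, assuming a hand-picked set of handlers; it only assembles the opener and opens nothing. In practice urllib2.build_opener() is the more common way to get an opener with the default handlers installed.

```python
try:  # Python 2, the version this page documents
    import urllib2
except ImportError:  # Python 3 renamed the module
    import urllib.request as urllib2

opener = urllib2.OpenerDirector()

# Register each handler with the director. add_handler() also gives
# the handler a reference back to the opener as handler.parent.
for handler in (
    urllib2.HTTPHandler(),
    urllib2.HTTPDefaultErrorHandler(),
    urllib2.HTTPRedirectHandler(),
):
    opener.add_handler(handler)

print(len(opener.handlers))  # 3
```

Calling opener.open(url) would then pass the request along this chain.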

class urllib2.BaseHandler

This is the base class for all registered handlers, and handles only the
simple mechanics of registration.
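One possible way to build on BaseHandler is sketched below. DefaultLanguageHandler is a hypothetical class, not part of urllib2; it relies on the convention that an opener calls a handler's http_request() method to pre-process HTTP requests. The method is called directly here so that nothing touches the network.

```python
try:  # Python 2, the version this page documents
    import urllib2
except ImportError:  # Python 3 renamed the module
    import urllib.request as urllib2


class DefaultLanguageHandler(urllib2.BaseHandler):
    """Hypothetical handler that adds an Accept-Language header."""

    def http_request(self, request):
        # Header names are stored capitalized by Request.add_header().
        if not request.has_header("Accept-language"):
            request.add_header("Accept-language", "en")
        return request


handler = DefaultLanguageHandler()
opener = urllib2.OpenerDirector()
opener.add_handler(handler)  # registration is inherited from BaseHandler

request = urllib2.Request("http://www.example.com/")  # hypothetical URL
handler.http_request(request)
print(request.get_header("Accept-language"))  # en
```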

class urllib2.HTTPDefaultErrorHandler

A class which defines a default handler for HTTP error responses; all such
responses are turned into HTTPError exceptions.
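To show the behaviour without a network connection, the sketch below calls the handler's http_error_default() method directly with a made-up 404 response. The URL is hypothetical; in normal use an opener makes this call, and the caller simply catches HTTPError around urlopen().

```python
try:  # Python 2, the version this page documents
    import urllib2
except ImportError:  # Python 3 renamed the module
    import urllib.request as urllib2

handler = urllib2.HTTPDefaultErrorHandler()
request = urllib2.Request("http://www.example.com/missing")  # hypothetical

caught = None
try:
    # Arguments: request, response file object (none here), status
    # code, reason phrase, response headers.
    handler.http_error_default(request, None, 404, "Not Found", {})
except urllib2.HTTPError as error:
    caught = error

print(caught.code)  # 404
print(caught.msg)   # Not Found
```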

class urllib2.HTTPRedirectHandler

A class to handle redirections.

class urllib2.HTTPCookieProcessor([cookiejar])

A class to handle HTTP Cookies.
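A minimal sketch of typical use, assuming a cookielib.CookieJar is supplied as the cookiejar argument and the opener is assembled with urllib2.build_opener(). No request is made, so the jar stays empty. The fallback imports are only there so the sketch also runs on Python 3.

```python
try:  # Python 2, the version this page documents
    import cookielib
    import urllib2
except ImportError:  # Python 3 renamed both modules
    import http.cookiejar as cookielib
    import urllib.request as urllib2

jar = cookielib.CookieJar()
processor = urllib2.HTTPCookieProcessor(jar)

# build_opener() installs the default handlers plus the ones given.
opener = urllib2.build_opener(processor)

# Cookies set by responses to opener.open(...) would be stored in the
# jar and sent back on later requests made through the same opener.
print(len(jar))  # 0
```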

class urllib2.ProxyHandler([proxies])

Cause requests to go through a proxy. If proxies is given, it must be a
dictionary mapping protocol names to URLs of proxies. The default is to read
the list of proxies from the environment variables
<protocol>_proxy. If no proxy environment variables are set, proxy settings
are obtained from the registry’s Internet Settings section on Windows, and
from the OS X System Configuration Framework on Mac OS X.
To disable autodetected proxies, pass an empty dictionary.
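The two explicit forms of the proxies argument might look like this. The proxy URL is hypothetical, and the sketch only constructs the handlers. It inspects the proxies attribute and the per-protocol <protocol>_open methods that the constructor creates.

```python
try:  # Python 2, the version this page documents
    import urllib2
except ImportError:  # Python 3 renamed the module
    import urllib.request as urllib2

# Route HTTP requests through a hypothetical proxy.
proxy_handler = urllib2.ProxyHandler({"http": "http://proxy.example.com:3128/"})

# An empty dictionary turns off proxy autodetection.
no_proxy_handler = urllib2.ProxyHandler({})

print(proxy_handler.proxies)                   # the mapping given above
print(hasattr(proxy_handler, "http_open"))     # True
print(hasattr(no_proxy_handler, "http_open"))  # False
```

Either handler can be passed to urllib2.build_opener(), where it takes the place of the default ProxyHandler.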