Introduction
Disclaimer: My personal opinion is still that Python is not a suitable language for production
environments and especially not for web development. Given the choice I would pick a language that is
more suitable for the web such as Java with JavaEE, PHP (I personally think Python is on par with PHP)
or - depending on the application - Elixir. There are situations though where using Python might be a
good idea - and then it's still better than C# based stuff and ASP.NET anyways. And it's always a good
idea to know how all of that stuff works since it's about the same for every newly hyped technology:
one will see that WSGI web applications work just like any other CGI application, there is no new black
magic except transparent launching of the interpreter and many fancy names for the components
implementing it - as usual.
So I've come across WSGI (the protocol as well as the module) lately, especially in the context of
Flask applications in a cloud environment. Since I've been using a little bit more Python at work
lately I thought that most of the tutorials I've read were way too complicated and did not really fit
my view of the whole Internet infrastructure. I also thought that it simply could not be that
complicated since the network and the whole stack still work the same as in the early 90s - so when
something sounds complicated it's usually just formulated in a complicated fashion or to sound fancy.
The principles are still the same and applications on the network also still work the same; the
fundamentals do not move that quickly even though the tools evolve and the languages change.
So I decided to dive a little bit into the matter and also into the configuration of uwsgi, which
turned out to be way more versatile than one would imagine after skimming over this blog article.
Basically it's a great tool when one wants to build a cloud like horizontally scaling system. It offers
a lot of flexibility that one will not use for a typical small to medium scale deployment - there one
could usually even skip the tool and ship a micro HTTP server inside one's application since the gain
of using an application server is not that large at this scale. Still, I think the value it can provide
is really underestimated in many cases.
Note that this blog article has been written from the viewpoint of someone who has already developed
many web applications in different frameworks and many different languages (but never a larger project
in one of the Python frameworks for the web). So it's of course heavily biased by previous experience
and by my view of the whole infrastructure - it's also shaped by a system and network administration
point of view as well as by knowledge of the basic inner workings of the whole WWW infrastructure.
So it's not written from the point of view of a beginner in web development ...
What is WSGI?
The Web Server Gateway Interface is pretty similar to something most web application developers
remember from the early years - the common gateway interface (CGI). It differs from CGI and pre-forked
FastCGI mainly in the design goal: CGI had been a specification to launch any external application,
supply information about the request in system environment variables, pass the request payload via
standard input and collect the response from standard output. WSGI works similarly but has been built
for Python in particular (though implementations such as the uwsgi application server are perfectly
capable of running applications written in Erlang, Ruby, Lua, Perl, Java, JavaScript, etc.).
On one side it "specifies" how each application has to look:
- A single callable (function, class with a __call__ method, etc.) has to be exposed that's called
  for each and every incoming request
- The callable accepts two parameters:
  - The first contains a CGI style environment - this is a mutable Python dictionary (not a subclass -
    really a dict) containing the environment variables and some required WSGI variables
  - The second argument is often called start_response and gets passed a callable. It accepts two
    positional and one optional argument.
    - The first parameter is the status that will be returned to the client - like 200 OK
      or 404 Not Found.
    - The second parameter contains a list of tuples that contain response headers. They are composed
      of their name and their value - for example to only set Content-type and Content-language
      the second argument could be set to the list
      [ ('Content-type', 'text/plain'), ('Content-language', 'en') ]
    - The third optional parameter might be passed a Python exc_info tuple in case the callable is
      called out of an error handler. Since the output of the previous start_response call has been
      cached it can be replaced with an error response as long as the output buffers have not been
      flushed.
- The callable should return an iterable that yields byte strings which get returned to the browser.
  This can be either a simple list containing one or more bytes objects or, for example for large
  files, an iterable that reads a file block wise and passes the bytes on to the client in
  an iterative fashion.
The start_response callable buffers the parameters passed. As mentioned above - as long as output
buffering has not started flushing - an error handler can call it a second time (passing exc_info) to
replace the existing buffered data by the error handler's output - though the author personally would
not consider that a clean application design. In any other case start_response must not be called
more than once.
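The second call from an error handler can be sketched as follows. This is only a minimal illustration under stated assumptions: render is a hypothetical helper that fails for one path, and the error text is made up. A real container re-raises the passed exception when headers have already been sent.

```python
import sys


def render(environ):
    # Hypothetical page renderer that fails for one path to trigger the handler
    if environ.get("PATH_INFO") == "/fail":
        raise RuntimeError("rendering failed")
    return b"All good"


def application(environ, start_response):
    try:
        # Regular call: status and headers are only buffered at this point
        start_response("200 OK", [("Content-type", "text/plain")])
        return [render(environ)]
    except Exception:
        # A second call is only allowed because exc_info is supplied
        start_response(
            "500 Internal Server Error",
            [("Content-type", "text/plain")],
            sys.exc_info(),
        )
        return [b"Something went wrong"]
```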
As one can see this is pretty much the same thing that CGI also did.
The API for the Python side of WSGI has been specified in PEP 3333 and is pretty short. In my opinion
it's a good idea to read about WSGI directly there to get to know the ideas behind this interface, even
when it looks really familiar to anyone who has ever written a CGI application in any programming
language. WSGI is used by nearly every Python web framework such as Flask or Django.
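The block wise iterable mentioned in the list above could be sketched like this. The block size and the in-memory payload are arbitrary choices for the illustration; in a real application the file object would come from open(path, "rb").

```python
import io

BLOCK_SIZE = 8192  # arbitrary block size chosen for this sketch


def read_blocks(fileobj, block_size=BLOCK_SIZE):
    """Yield the content of a binary file object block by block."""
    try:
        while True:
            block = fileobj.read(block_size)
            if not block:
                break
            yield block
    finally:
        # Release the file even if the client disconnects mid-transfer
        fileobj.close()


def application(environ, start_response):
    # io.BytesIO stands in for a large file on disk
    payload = io.BytesIO(b"x" * 20000)
    start_response("200 OK", [("Content-type", "application/octet-stream")])
    return read_blocks(payload)
```

The container iterates over the returned generator, so at most one block is held in memory by the application at a time.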
The life cycle
In contrast to larger and more sophisticated specifications such as JavaEE the life cycle of a WSGI
"servlet" is not specified in any way. This is container specific though most of them will load a
module once and call the callable more than once. This is especially the case for horizontally scaling
cloud infrastructure that spawns server processes on demand. Whenever the application gets replaced
most containers support swapping the loaded module in a graceful way - i.e. finishing already
running requests with the old code while hot reloading the new code. There are containers though
that work like legacy CGI without pre-forking and load code on demand, though that leads to a
huge overhead with interpreted languages such as Python. Unfortunately most containers do not provide
clean life cycle callbacks, so you cannot launch background tasks or allocate shared resources
whenever the application is loaded (ok, you could use __init__.py for that) and cleanly release
them whenever it gets evicted or replaced (this is the problem). For some application
servers such as uwsgi you are able to write extensions to provide some kind of background tasks - but
then they have to be deployed independently of the application again. This is in my opinion one of
the points where the whole specification needs major improvement before being really usable in
a general sense.
Environment variables
The following environment variables are required to be present (or may be omitted if they would be empty):
- REQUEST_METHOD is the HTTP request method (GET, POST, DELETE, etc.)
- SCRIPT_NAME is the path of the web application excluding the domain (or empty for the root).
  How this is handled depends on the application server configuration. Sometimes this is empty when
  a single framework script handles all requests to a given virtual host.
- PATH_INFO is as usual the path information relative to the script name - or the full path in
  case a single script handles all requests to a given vhost.
- QUERY_STRING is - as specified in the HTTP specification - everything following the question
  mark in a URI in undecoded form.
- CONTENT_TYPE may be empty or absent and contains the Content-type header of the
  request. This is not to be confused with the Accept header of course. This is often used
  with POST, PUT or similar requests. See the HTTP specification for details - it works
  the same as with any other language.
- CONTENT_LENGTH is used in conjunction with CONTENT_TYPE and contains the Content-length header
  of the request, i.e. the size of the request body in bytes.
- SERVER_NAME and SERVER_PORT have to be strings that are always set to the given server
  name and port.
- SERVER_PROTOCOL is set to a value such as HTTP/1.0 or HTTP/1.1 depending on the
  protocol used
- All client supplied request headers should be contained in variables prefixed with HTTP_. For
  example any Host: header should be contained in HTTP_HOST, any Accept header in HTTP_ACCEPT
  and so on.
- SSL state should be exposed to the application in an application server dependent way - this makes
  applications relying for example on client side authentication less portable between application
  servers and gateways.
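How an application might read these variables can be sketched as follows. This is just an illustrative handler that echoes what it received; the output format is made up for the example.

```python
from urllib.parse import parse_qs


def application(environ, start_response):
    method = environ["REQUEST_METHOD"]
    script_name = environ.get("SCRIPT_NAME", "")
    path_info = environ.get("PATH_INFO", "")
    # QUERY_STRING arrives undecoded; parse_qs decodes it into lists of values
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # HTTP_ACCEPT_LANGUAGE becomes Accept-Language and so on
    headers = {
        key[5:].replace("_", "-").title(): value
        for key, value in environ.items()
        if key.startswith("HTTP_")
    }
    lines = [
        f"method: {method}",
        f"script: {script_name}",
        f"path: {path_info}",
        f"query: {query}",
    ]
    lines += [f"header {name}: {value}" for name, value in sorted(headers.items())]
    body = ("\n".join(lines) + "\n").encode("utf-8")
    start_response("200 OK", [("Content-type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```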
In addition the dictionary also contains some other WSGI specific stuff:
- wsgi.version is the version number of the WSGI protocol as a tuple, for example (1, 0)
- wsgi.url_scheme contains the URL scheme used - most likely the string "http" or "https"
- wsgi.input contains any client side input stream like POST parameters or uploaded files. Depending
  on the application container this has to be consumed to prevent errors. This object can be used
  like a file opened read only by open - but without any seeking capabilities.
- wsgi.errors allows one to write errors in a well defined way to a text-mode file type output
  stream. Terminate lines with \n. This is usually written into a logfile (like stderr for most
  containers)
- wsgi.run_once evaluates to True when the environment is some kind of one-shot launched
  application where - like for traditional CGI applications - a new process is launched for each
  and every web request.
- wsgi.multithread evaluates to True when multiple threads may call the application object
  at once in the same process
- wsgi.multiprocess should evaluate to True when multiple instances of the same application might
  exist in multiple processes. The author personally likes to assume this is the case anyway.
The stream objects support read(size), readline(), readlines(hint) as well as __iter__()
for the sources and write(str), writelines(seq) as well as flush() for the sinks.
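Reading the request body from wsgi.input and logging to wsgi.errors might look like the following sketch. It simply echoes the body back; treating a missing or malformed CONTENT_LENGTH as "no body" is a simplification chosen for this example.

```python
def application(environ, start_response):
    # CONTENT_LENGTH may be absent or empty, treat both as "no body"
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    # Never read more than announced, the stream has no reliable end marker
    body = environ["wsgi.input"].read(length) if length > 0 else b""
    # wsgi.errors is a text stream, so a str is written and terminated with \n
    environ["wsgi.errors"].write(f"received {len(body)} bytes\n")
    start_response("200 OK", [("Content-type", "application/octet-stream"),
                              ("Content-Length", str(len(body)))])
    return [body]
```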
Optionally a container might provide a wsgi.file_wrapper. This can be used to transmit
file like objects from the filesystem using operating system facilities like sendfile.
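Since the wrapper is optional, an application has to fall back to plain iteration when it is missing. A minimal sketch of that pattern, with send_file being a hypothetical helper name:

```python
def send_file(environ, fileobj, block_size=8192):
    """Return an iterable for fileobj, preferring the container's file wrapper."""
    wrapper = environ.get("wsgi.file_wrapper")
    if wrapper is not None:
        # The container may turn this into a sendfile style transfer
        return wrapper(fileobj, block_size)
    # Fallback: read fixed size blocks until read() returns an empty bytes object
    return iter(lambda: fileobj.read(block_size), b"")
```

The application would then simply return send_file(environ, open(path, "rb")) as its response body. Note that the fallback branch of this sketch does not close the file by itself.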
WSGI containers
There are a number of dedicated WSGI containers that one can use. The most popular one is uwsgi,
which is basically a simple wrapper around one of the event handling libraries and the language
plugin (Python being always present). It also allows for arbitrary plugins that might handle MQTT messages
or other stuff.
Other popular containers are:
- The green Unicorn gunicorn
- The greenlet based gevent
- CherryPy that also includes a web server fully written in Python
In addition there are web server plugins that host WSGI applications such as mod_wsgi
for the Apache httpd web server.
When using an application server one usually does not expose the application server directly to the
outside world - one usually puts a web server, a load balancer or any other component that plays
reverse proxy in front of it, especially for serving static files, terminating SSL connections, etc.
The most common layouts use web servers such as nginx or Apache httpd in front of the
application servers. Also keep in mind that you usually don't want to trust the pretty young and
novel implementations of those Python application servers, so take the usual precautions of partitioning
your systems to isolate them in your environment as much as possible - but that's a good practice anyways.
Installing uwsgi on FreeBSD
To get started one might use the uwsgi application server. It is set up pretty quickly,
especially for development, and can also be configured for production environments. Note that
this requires a little bit more consideration than mentioned here in the beginning - don't use uwsgi
by simply launching your script on any production machine ...
Installation is pretty simple and can be done through packages or ports (don't install via pip
though):
or
cd /usr/ports/www/uwsgi
make install clean
The latter approach of course allows one to set compile time options - uwsgi is pretty flexible. Note
that this also installs the /usr/local/etc/rc.d/uwsgi init script that allows one to launch uwsgi
with the standard rc.conf framework.
The available /etc/rc.conf variables are:
- uwsgi_enable that can be set to YES or NO as usual
- uwsgi_socket for a WSGI socket (default /tmp/uwsgi.sock) in combination with
  uwsgi_socket_mode (660) and uwsgi_socket_owner (uwsgi:www)
- uwsgi_emperor that toggles emperor mode
- uwsgi_configfile supplies a configuration file, defaulting to /usr/local/etc/uwsgi/uwsgi.ini
- uwsgi_vassals_dir defaults to /usr/local/etc/uwsgi/vassals and is required for
  cluster configuration
- uwsgi_logfile defaults to /var/log/uwsgi.log
- uwsgi_pidfile defaults to /var/run/uwsgi.pid
- uwsgi_uid and uwsgi_gid to specify owner and group (default uwsgi:uwsgi)
- uwsgi_flags allows specifying additional flags (defaults to -L)
- uwsgi_procname might allow one to change the used program
- uwsgi_profiles allows one to supply a list of loaded uwsgi profiles (optional)
At the time of writing the package does not install any sample ini files though.
The most simple WSGI application
So let's start with the most simple hello world WSGI application. This can be written either in an
imperative style or as a class implementation. The most simple way is the imperative structure - but
it can be any callable named application:
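# The uwsgi module only exists inside the uwsgi application server;
# it is not needed for this minimal example
import uwsgi

def application(env, start_response):
    # Buffer status line and response headers
    start_response(
        "200 OK",
        [
            ('Content-type', 'text/plain'),
            ('Content-language', 'en')
        ]
    )
    # The body is an iterable of bytes objects
    return [
        b"Hello world!"
    ]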
To launch a simple test version of this script on the local development machine one can directly launch
it using the --wsgi-file parameter for uwsgi:
$ uwsgi --http :1234 --wsgi-file helloworld.py --need-app
The argument --http :1234 starts the application server listening for requests on port 1234, so it
can be reached for example at http://localhost:1234.
In addition to http uwsgi is also capable of listening on:
- An --https-socket that of course requires certificate and key configuration
- An --fastcgi-socket when being just a wrapper for FastCGI calls
Note that the http or https sockets don't have to be real network sockets - one can also use a Unix domain
socket which might be of special interest behind a reverse proxy (that one should use anyways) when one
restricts network access of the container itself. This can be done by simply specifying the filename of
the Unix domain socket:
$ uwsgi --http /path/to/socket.sock --wsgi-file helloworld.py --need-app
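The class based variant mentioned at the start of this section could look like the following sketch. HelloApplication and its greeting parameter are made-up names for illustration; the only thing uwsgi looks for by default is a callable named application.

```python
class HelloApplication:
    """Class based WSGI application; instances are callable via __call__."""

    def __init__(self, greeting="Hello world!"):
        # Encode once at load time, the response body has to be bytes
        self.body = greeting.encode("utf-8")

    def __call__(self, env, start_response):
        start_response(
            "200 OK",
            [
                ("Content-type", "text/plain"),
                ("Content-language", "en"),
                ("Content-Length", str(len(self.body))),
            ],
        )
        return [self.body]


# uwsgi looks for a callable named "application" by default
application = HelloApplication()
```

The instance is created once when the module is loaded, so anything set up in __init__ is shared by all requests handled by that process.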
Running from a package
So running from a file is ok for simple demonstration purposes and the most simple applications but this
might not be what one usually has in mind when deploying a Python application. More likely one's
application will be packaged using setuptools like any other Python application.
To build the package one again requires a simple pyproject.toml and a simple setup.cfg. For
demonstration purposes the author used the following pyproject.toml:
[build-system]
requires = [
    "setuptools>=42",
    "setuptools-git-versioning",
    "wheel"
]
build-backend = "setuptools.build_meta"

[tool.setuptools-git-versioning]
enabled = true

and the following setup.cfg:

[metadata]
name = modulewsgihelloworld-tspspi
version = 0.0.1
author = Thomas Spielauer
author_email = pypipackages01@tspi.at
description = Just a demonstration hello world project on how to package WSGI applications
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/tspspi/modulewsgihelloworld
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: BSD License
    Operating System :: OS Independent

[options]
package_dir =
    = src
packages = find:
python_requires = >=3.8
install_requires =
    matplotlib >= 3.4.1

[options.packages.find]
where = src
In src/modulewsgihelloworld/helloworld.py the following source has been added:
import uwsgi

def mainhello(env, start_response):
    start_response(
        "200 OK",
        [
            ('Content-type', 'text/plain'),
            ('Content-language', 'en')
        ]
    )
    return [
        b"Hello world!"
    ]
To be a valid module src/modulewsgihelloworld/__init__.py has to exist (it is just an empty file).
To prevent problems when building the package with git versioning (though not releasing
that one on PyPI of course) one can also add a gitignore at src/modulewsgihelloworld/.gitignore
ignoring the *.egg-info and a gitignore at dist/.gitignore that ignores *.tar.gz and *.whl.
This results in the following directory structure:
|- pyproject.toml
|- setup.cfg
|- dist
|  |- .gitignore
|- src
   |- modulewsgihelloworld
      |- __init__.py
      |- helloworld.py
Then the module got built (see my previous blog article for details using
After installing the given module (pip install dist/*.whl) the module can be executed using the module
option:
$ uwsgi --module modulewsgihelloworld.helloworld:mainhello --http :1234 --need-app
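Conceptually the module:callable notation is just an import followed by an attribute lookup. The following sketch mimics that idea in plain Python; it is not uwsgi's actual loader, resolve is a hypothetical helper and expressions such as MyHandler() are not handled here.

```python
import importlib


def resolve(spec):
    """Resolve a 'package.module:attribute' string to the named object."""
    module_name, _, attribute = spec.partition(":")
    module = importlib.import_module(module_name)
    # Fall back to the conventional name when no attribute is given
    return getattr(module, attribute or "application")
```

With the package from above installed, resolve("modulewsgihelloworld.helloworld:mainhello") would return the mainhello function, which is handy for a quick check that the package is importable before involving the application server.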
When using packages one is now really able to use either one's own package repository, package files or
even PyPI to distribute and upgrade applications using one's build automation system.
Reading uwsgi configuration from an --ini or --yaml file
Above the configuration for uwsgi has been read from the command line. This is of course inconvenient,
especially when one launches it with more elaborate features (uwsgi supports clustering, local synchronized
caches, various life cycle management algorithms, different event loops, filesystem mounting and unmounting
when running a horizontally scalable cloud system; it supports reloading on different external mechanisms,
monitoring via metrics and statistics, multicasting, async mode, lazy loading mode, different clock sources,
static file serving, etc.). Most of the features are especially important when moving to a production
system or writing more complex applications.
Some of the interesting options for a beginner that one should know of are:
- Binding with HTTP to a specific port or Unix domain socket using --http :port
  or --http /path/to/sockfile.sock
- Binding with the native uwsgi protocol to a specific port or Unix domain socket using --socket :port
  or --socket /path/to/sockfile.sock
- Requiring that an application can be loaded with --need-app
- Running from a Python file with --wsgi-file filename.py
- Running from a Python module using --module my.example.module.file:MyHandler() that might also be
  installed in a virtual environment specified by --virtualenv /path/to/venv
- Automatically (only during development) reloading a module when one of its files changed using
  --py-autoreload so one does not have to reload the file every time.
- Importing modules with --pyimport
- Launching multiple processes and threads using --master --processes NPROCESSES --threads NTHREADS
- Exporting statistics via --stats 127.0.0.1:2345
- When one requires Python threads one has to set --enable-threads because otherwise the GIL is not
  initialized and threads created by the application will not run. Though threads on CPython are not
  really usable as of today anyways
- When one launches the application server as root (bad idea anyways) one might specify --uid
  and --gid to drop privileges as soon as possible after binding to sockets.
- To daemonize the app after launching supply --daemonize, or --daemonize2 to daemonize after the
  application has been loaded
All of those options can be set in an ini or yaml file that's then passed to uwsgi. Using
an ini file this might look like the following:
[uwsgi]
http = :1234
https = :1235,mycert.crt,mykey.key
chdir = /my/app/directory
module = my.example.module.file:MyHandler()
virtualenv = /path/to/venv
master = true
processes = 4
threads = 8
One can then launch uwsgi using:
$ uwsgi --ini filename.ini