Running Twisted Daemons with twistd

Twisted ships with a nice daemon runner called “twistd” that can do a lot of different things for your Twisted plugin or application.  It can set the UID/GID of your process, open up a log file and manage a PID file for you.   All of this is configured with command-line options to twistd.

Twisted Tower of the de Young Museum

While each of these options is well documented, the way in which they interact and the order in which these properties are applied to daemon creation are not.  Starting a daemon involves grabbing some ports, changing UID/GID, opening up and managing both a logfile and a PID file.  The management of the logfile and PID file is complicated by the fact that the UID of twistd changes between the time these files are created and when twistd wants to modify them.

This post explains the sequence of operations that twistd performs for starting a daemon defined in a TAC file on Unix.  We will consider only a minimal subset of the options that twistd handles so that we can focus on the interactions due to the daemon’s UID/GID changing.  Our assumptions are that:

  • twistd is started as user root,
  • our daemon will run at reduced privileges,
  • our application is defined in a TAC file and
  • we are discussing Unix only.

Along the way we’ll explain how the order in which twistd performs its operations creates a few “gotchas” that one might run into when setting up logging and a PID file.

An Overview of the twistd Daemon-starter

Although twistd actually has more options than are shown here, the ones that we are interested in for this article are summarized below.

twistd --uid=UID --gid=GID --umask=UMASK --chroot=CHROOT
     --rundir=RUNDIR --pidfile=PIDFILE --logfile=LOGFILE -y myapp.tac

Here is an outline illustrating the steps that twistd performs in order to create the daemon from the TAC file. We’ll describe each of the steps in turn.

  1. preApplication
    1. checkPID
  2. createApplication
    1. readTacFile
    2. startLogger
  3. postApplication
    1. setupEnvironment
      1. chroot
      2. chdir
      3. umask
      4. daemonize
      5. openPIDFile
    2. privilegedStartService
    3. switchUidGid
    4. startApplication
    5. startReactor
    6. removePIDfile

The “preApplication” phase is what is executed before the application is even created.

  • checkPID – this checks for the existence of a prior PID file and removes it if the old PID does not correspond to a currently running process.  This step is performed as user root

The “createApplication” phase deals with the instantiation of the object that defines a Twisted application.  An application is a service object created with a call to the function twisted.application.service.Application.

  • readTacFile – Read the contents of the TAC file “myapp.tac” as Python source code.  Evaluate it in a completely empty namespace. Upon completion return the value of the variable “application” (if there is one) and discard everything else.  Note: this step is performed as user root.  The code in the TAC file is free to import Python modules at will and can interact with the file system, but it is not able to access global Python state and can only return a single value.
  • startLogger – If the Application object returned in the previous step has not defined a logger, then give this application a default rotating logfile.  This step is performed as user root.  The logfile will be created with name LOGFILE and owned by user “root.”

The “postApplication” phase does most of the real work of running the Twisted application.

  • setupEnvironment – Call chroot(CHROOT), chdir(RUNDIR) and set the umask to UMASK.  Daemonize the process by doing the double-fork trick.  Lastly, create a PID file with name PIDFILE.  The PID file will be created by user “root.”
  • privilegedStartService – Grab the ports that are needed.
  • switchUidGid – Change the daemon’s user and group IDs to UID and GID.
  • startApplication – Call startService on our application’s service object.
  • startReactor – Start the Twisted event-reactor in motion.  It is while the reactor is running that the logfile will be rotated.  Log management will be done with the privileges of UID/GID.
  • removePIDFile –  After the reactor has finished, remove PIDFILE.  This will be done with permissions UID/GID.

Implications for Logfile Rotation

Using the configuration described here, the LOGFILE will be created as user “root” and group “root”, but rotated as user UID and group GID.  If you want rotation to work as advertised it is necessary to put the LOGFILE in a directory in which UID/GID has permissions to rename files.

Implications for PID file creation

The PIDFILE will be created as user “root” but when it comes time to remove it, the daemon process will have the permissions UID/GID.  If you want your daemon to be able to remove its PID file, then it would be placed in a directory in which UID/GID has permissions to remove files.

Conclusion

Twisted’s daemon-runner is a useful and well-tested program that has been in use for a long time.  Some of its side-effects are due to the order in which it performs its steps.  This note laid out some of these steps to explain how process permissions interact with logfile rotation and PID file removal.

Twisted: learning about Cred and Basic/Digest Authentication

One of Twisted’s best features is it’s credentials plug-in architecture.  Twisted abstracts the notion of username/password across many different protocols and password-protection schemes.  By abstracting the idea of “credentials” it is possible to implement one password checking facility and use it to protect both FTP and HTTP logins.  Twisted has a deep class hierarchy and a fairly significant learning curve however, so it’s difficult to get your head around some of the concepts until you dive in.

Twisted Matrix Labs

This article will share a little bit of what I’ve learned recently using Twisted.  I’ll show how to implement simple password protection of a static web resource using Basic and Digest auth schemes, and I’ll show an example that doesn’t quite work!  Then I’ll explain what is wrong with that example and fix it. The code snippets shown are complete programs that may be a good starting point for experimenting further.

Password Dictionary

The first example implements a password dictionary checker for a simple web site protected with Basic Auth.  The PasswordDictChecker class implements the interface “ICredentialsChecker.”  Checking a credential is performed by the method requestAvatarId: if the password given matches the password sent, then the username is returned, otherwise, the lookup fails.

Our HTTP Realm (class HttpPasswordRealm) maps authenticated users to resources. In our case, there is only one resource and it is fixed: self.myresource. Any authenticated user receives the same fixed resource regardless of their username.

Our web resource is a simple static example. We render the last part of the path as part of the page.

Our main program stitches all of the pieces together. We mount our resource at http://localhost:8081/example/foo, and it is protected by the passwords in the dictionary. Run this program and try it out. Any of the three passwords will get you to the same resource.

Incorrectly extending our First example to Digest Authentication

In the next example, we substitute a DigestCredentialFactory using md5 encryption for our BasicCredentialFactory of the first example.  This seems to be a straightforward substitution: instead of using the Basic auth scheme, I want Twisted to use the Digest auth scheme with the same password dictionary.  Unfortunately, this program does not allow access to the resource. It was not entirely obvious to me why not. If you run this, you will see that no matter what passwords you type, you are not granted access.  Twisted doesn’t complain, but it does deny you access to the resource.

For a while I believed that there was a problem with Twisted.  There was not: there was a problem with my understanding of the cred system. Our class PasswordDictChecker implements the interface for checkers.ICredentialsChecker – a credentials checker.  What I didn’t get my head around at first was the sub-interface specification.  Our first PasswordDict says that the “credentialsInterfaces” that it implements are only for credentials.IUsernamePassword. What I observed was that the code for requestAvatarId wasn’t even called in this case.  It turned out that the interface specification resulted in the credentials being rejected even before password comparison was being performed.

Upon further research I understood that the Digest authentication method did not supply credentials of the type IUsernamePassword. Instead, it supplies credentials of the type IUsernameHashedPassword. Because this “type” is not in the list of credentialInterfaces, the authentication machinery cannot find a checker for our Digest authentication scheme, and all logins fail.

Digest Authentication

Extending our example to Digest auth involves – adding credentials.IUsernameHashedPassword to the list of credentialInterfaces – abstracting password checking from the equals operator to using the checkPassword method In the example below, we’ve abstracted our simple password dictionary checker to one that abstracts to two types of credentials interfaces and uses the method call “checkPassword” to verify password validity.  This password checker is versatile enough to be used with both Basic and Digest authentication.  (It is also flexible enough to use with other protocols, such as FTP.) As a matter of fact, this password checker is essentially the same as one bundled with Twisted that is appropriately called: twisted.cred.InMemoryUsernamePasswordDatabaseDontUse.
(Do not use it because in-memory passwords can be a security hole.)

Conclusion

This article shows a straightforward implementation of password checker that authenticates against an in-memory dictionary. It used this checker with HTTP Basic auth, and then attempted to use the same checker with HTTP Digest auth. Using the abstraction facilities of the Twisted Cred system, we finally implemented a dictionary password checker that was compatible with either Basic or Digest auth.

Python multithreaded daemon with SIGTERM support (a recipe)

I like to use Python to stitch together little programs into services that run as daemons.  Python makes it easy to run multiple threads with modern synchronization constructs and to run Unix commands and pipes using the “subprocess” module.  However, I was having trouble cleanly killing my threads and subprocesses when I wanted to restart the daemon.  Some of the reports I had read on the web warned against using multithreading, subprocesses, daemonization and signals together, but it’s all working fine for me.  I thought I’d share my recipe and some of what I had learned here.

Python does not have a standard daemon module

Creating a daemon on Unix requires a “double-fork” recipe that I can only say is partly magic.  The recipe cited here seems to be the most-often cited resource for how to fork a daemon.  It dates from 2001.

http://code.activestate.com/recipes/66012/

One may wonder why this recipe hasn’t been turned into a standard Python daemon package.  I can think of two reasons.  One: the code is small enough that it doesn’t deserve to be a package.  Two: a good package would handle many possible daemonization needs.  This recipe imposes restrictions on the daemon and only works for simple cases.

If you’re will to accept its restrictions (like only the normal IN/OUT/ERR file descriptors open) then it seems to be fine.  I was quite happy to keep my service simple.  Stdout and stderr are the only two streams used, and the only signal handled is SIGTERM.

In the end, I chose to use the variant posted by Clark Evans here that added the daemon start/stop/restart behavior.

http://code.activestate.com/recipes/66012/#c9

Python masks signals to subprocesses

This is where I got a little hung-up.  My service was running subprocesses: both as simple commands and as long-running threads. I was having a hard time terminating sub-programs when I wanted my service to stop.

I found a great resource in this article:

http://www.doughellmann.com/PyMOTW/subprocess/

Python masks the signals to subprocesses it starts.  (Here “masks” means “does not propagate”) If you use the subprocess module to start a long-running job like this “sleep” command:

 import subprocess
 job = subprocess.Popen(["sleep", "6000"])
 (se, so) = job.communicate()

and then kill the parent Python process with a SIGTERM, the child “sleep” process will keep on running.  The article referenced above explains how to set a Unix “session ID” (os.setsid) and how to send a signal to a “process group” (os.killpg).  I modified the technique a little bit in the interest of simplification.

Use the threading.setDaemon() flag

Python will not exit if normal threads are still running: you must shut them down explicitly.  However, Python will exit if only “daemon” threads are running.  A daemon thread is simply something you deem unimportant enough that it does not require an explicit shut-down step.  Write a daemon thread like this:

class myThread(threading.Thread):
  def __init__(self, args ...):
  ...
  threading.Thread.__init__(self)
  self.setDaemon(True)

Putting it all together

SIGTERM_SENT = False

def sigterm_handler(signum, frame):
    print >>sys.stderr, "SIGTERM handler.  Shutting Down."

    global SIGTERM_SENT
    if not SIGTERM_SENT:
        SIGTERM_SENT = True
        print >>sys.stderr, "Sending TERM to PG"
        os.killpg(0, signal.SIGTERM)

    sys.exit()

def main():
    # set session ID to this process so we can kill group in sigterm handler
    os.setsid()
    signal.signal(signal.SIGTERM, sigterm_handler)

    while True:
        # ... run daemon threads and subprocesses with impunity.

    # ... this function never returns ... 

from daemonize import startstop

if __name__== "__main__":
    startstop(stdout="/var/log/example.log",
              pidfile="/var/run/example.pid")
    main()

Here’s what we came up with.  (Start at the bottom.)  The startstop function does the daemonization double-fork.  Thus, the PID of the daemon is a grandchild of starting Python process.  When main() is called, the effective PID is that of the daemon.  Calling os.setsid() sets the session id to the PID of the daemon.  (The PID is the one written to the pid-file).

The sigterm handler is called when a SIGTERM arrives.  Its main purpose is to send SIGTERM to the process group of the session.  (Recall that Python has masked the signals to its subprocess children, so we have manage sending the signals.)  This step terminates the child processes.  However, it also re-sends SIGTERM to the main daemon process – which we didn’t want.  We use the SIGTERM_SENT flag so that on the second receipt of the TERM signal, we ignore it.

In the end, this recipe is fairly simple.  Your mileage may vary, but if keep your use of subprocesses and threads simple, this might work for you too.