Watch out for the Linode 50 Mbps Bandwidth Cap

Linode is great.  One of the things that is great about Linode is their private network.  When a Linode is allocated, you can choose to give your Linode a Private IP address in addition to the Public IP address that all Linodes receive.  Data packets routed through the Private IP address do not count against your billing quotas and get to use a very high capacity private network.


Linode caps the outbound bandwidth of each of your instances at 50 Mbps, regardless of which network (public or private) your data is directed to.  This cap is in place to help customers manage their bandwidth quota in case some process goes awry.  But if you intend to process a lot of data through your Linode, this cap will get in the way.  If you run into it, your packets will simply get dropped and your TCP connections will behave badly.

Before this happens, you might want to take a look at the traffic graphs for your high-throughput Linodes.  Below is a traffic graph from a node that is exceeding its cap.  The clipping is dramatic in this picture.  When the maximum rate is just barely over 50 Mbps, the clipping in the graph will be more subtle, but the effect of losing packets will be just as bad.

Figure: Linode traffic graph clipping outbound bandwidth at 50 Mbps

Linode support will gladly raise the cap on your outbound traffic; just drop them a note.  Your node will enjoy the increased bandwidth after a reboot.
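To get a feel for what the default cap means in practice, here is a rough back-of-the-envelope calculation in Python.  The 100 GB figure is only an illustrative assumption, the function name is made up for this example, and the arithmetic ignores protocol overhead and retransmissions.

```python
def transfer_hours(gigabytes, mbps=50):
    """Hours needed to send `gigabytes` (decimal GB) at `mbps` megabits per second."""
    bits = gigabytes * 8 * 10**9        # 1 GB = 8 * 10^9 bits
    seconds = bits / (mbps * 10**6)     # 1 Mbps = 10^6 bits per second
    return seconds / 3600


# Illustrative assumption: pushing 100 GB out of a single capped Linode.
print(round(transfer_hours(100), 2))
```

At 50 Mbps the hypothetical 100 GB transfer takes about 4.4 hours, no matter how fast the private network itself is.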

Video Summarization is the Biggest Problem with Internet Cameras

What happens when you record every frame emitted by an IP-camera? You end up with too much data to make sense of. I now have nearly unlimited data storage, but have little interest in reviewing everything stored. Watching recorded footage in real-time is too time-consuming to be enjoyable, or even reasonable. While playing it back at 2x or 4x speed might sound like a good idea, that’s still a lot of video to look at.

Please Summarize!

Most video editing software offers a “scrubbing” operation for rapidly finding a point in time. Scrubbing is the act of manually moving the transport control backwards and forwards through the images. If you’ve ever scrubbed looking for a single frame you remember seeing, you’ll have noticed that it’s sometimes hard to find the frame. Your monitor is displaying no more than 60 or 75 frames per second: if you scrub over a time period with a resulting rate faster than this, you are not seeing everything.
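The arithmetic behind this is simple enough to sketch in Python.  The footage length, camera frame rate and scrub time below are illustrative assumptions, and the function name is made up for this example.

```python
def fraction_of_frames_shown(footage_seconds, camera_fps, scrub_seconds, display_hz=60):
    """Upper bound on the fraction of recorded frames a monitor can show while scrubbing."""
    recorded = footage_seconds * camera_fps    # frames stored for the period
    displayable = scrub_seconds * display_hz   # frames the monitor can draw meanwhile
    return min(1.0, displayable / recorded)


# Assumption: one hour of 30 fps footage scrubbed in 10 seconds on a 60 Hz monitor.
print(fraction_of_frames_shown(3600, 30, 10))
```

Under these assumptions the monitor can show at most 1 frame in 180, so the frame you are hunting for is very likely to be skipped.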

“Video summarization” is a field of study aimed at developing algorithms and methods that abstract and identify interesting features in a segment of video in order to direct the viewer’s attention. Here is a great quote describing video summarization [1]:

Video summarization methods attempt to abstract the main occurrences, scenes, or objects in a clip in order to provide an easily interpreted synopsis.

At Sensr.net, we consider video summarization to be an important part of our technology, recognizing that keeping a collection of all the frames your camera emits is just too much data to use. Our summarization techniques are straightforward: we use motion-detection algorithms and save only the frames in which motion is detected. We also offer a simple form of “hierarchical video-summarization.” When you look at a shot of all of the hours-of-the-day, you are presented with the most important 24 frames of that day. Similarly, the days-of-the-month are summarized by the most important frame of each day.
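As a rough illustration of this kind of summarization, here is a minimal Python sketch.  It is not Sensr.net’s actual implementation: frames are modeled as flat lists of grayscale values, motion is estimated by simple frame differencing, “most important” is assumed to mean “highest motion score,” and the function names and threshold are hypothetical.

```python
def motion_score(prev, curr):
    """Mean absolute per-pixel difference between two equal-sized grayscale frames."""
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)


def summarize_by_hour(frames, threshold=10.0):
    """Pick one representative frame per hour.

    frames: list of (hour, pixels) tuples in capture order.
    Frames whose motion score does not exceed `threshold` are dropped.
    Returns {hour: index of the highest-scoring remaining frame}.
    """
    best = {}
    for i in range(1, len(frames)):
        hour, pixels = frames[i]
        score = motion_score(frames[i - 1][1], pixels)
        if score <= threshold:
            continue  # no significant motion: do not keep this frame
        if hour not in best or score > best[hour][0]:
            best[hour] = (score, i)
    return {hour: index for hour, (score, index) in best.items()}


# Tiny made-up example: 4-pixel frames captured during hours 0 and 1.
frames = [
    (0, [0, 0, 0, 0]),
    (0, [0, 0, 0, 0]),
    (0, [50, 50, 0, 0]),
    (0, [200, 200, 200, 200]),
    (1, [200, 200, 200, 200]),
    (1, [0, 0, 0, 0]),
]
print(summarize_by_hour(frames))
```

The same idea can be applied one level up: summarizing a day would mean picking the highest-scoring frame among that day’s hourly representatives.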

Sensr.net has been hard at work laying the “pipes” for moving the frames emitted by internet cameras through our processors and into the cloud. You can expect to see more from us in the video summarization arena. Until then, take a look at this excellent slide presentation and think about what sorts of summarization you would like to see for your internet camera application.

[1] http://www.cs.utexas.edu/~grauman/courses/spring2008/slides/Video_Summarization.pdf

Benign Surveillance – a Friendly Eye

What if you could look at any public place, anywhere in the world, from your home?  What if there was a network of shared cameras, available for your use?  Would you look at a scenic vista over San Carlos?  Would you see who is arriving at Ritual Coffee in the Mission?  Would you look to see if the sun was shining out at Lands End?

What would you call the activity that you’re doing?

  • Is it surveillance?
  • Is it peeping?
  • Is it watching?
  • Are you a voyeur?

None of these words describe very well the benign use of internet cameras to take a look at something somewhere else.  Sensr.net provides such a service and our tag line is simply “Watch your Stuff!”  I think this is a pretty good summary of what people are currently doing with Sensr.net.  They’re also doing things like sharing their camera views and posting clips to Twitter and Facebook.

Often, when I’m talking to people about Sensr.net and internet cameras, I receive a partially hostile response.  People associate cameras with centralized authority or control.  Even the word “surveillance” implies watching someone because they are suspect [http://dictionary.reference.com/browse/surveillance].  People worry about their privacy.  The meme that makes us worry about being watched seems pretty deeply ingrained in our society.  I’ve wondered where that comes from.

Watching has long been used as a form of control.  In the 18th century, a philosopher named Jeremy Bentham designed a type of prison called the Panopticon.  The word combines the roots for “observer” (-opticon) with “all” (pan-).  In this prison, an observer could see everyone no matter where they were.  The idea was that if no one knew when they were being watched and when they weren’t, all prisoners would behave.  Closed-Circuit Television (CCTV) is sometimes considered a modern incarnation of the Panopticon.  But at least some studies have shown that all this watching doesn’t really cut crime rates [http://arstechnica.com/tech-policy/news/2008/05/problems-with-the-panopticon-uks-cctv-doesnt-cut-crime.ars].

So what happens when remote cameras simply become an adjunct to the other ways we perceive our environment?  Then internet cameras simply become tools for viewing.  One of my co-founders noted that perhaps the feeling of invasion stems from the fact that there are two different groups of people: the watchers and the watched.

With Sensr.net we have an opportunity of turning things around, contrary to traditional video surveillance.  With Sensr.net, the watchers and the watched can be part of the same social network. [YB 2010]

I really like this idea, but then I come back to the problem that generated this blog post: what is a good word to describe the service that Sensr.net is providing?  I don’t have one yet, so I might have to invent my own.  How about a word with Latin roots: “amicus-oculus” – a friendly eye!  Not very sexy, but not scary either.

If you’re reading this and you have a good suggestion for a word or phrase we can use to describe Sensr.net, please contribute it below as a comment.

Running Twisted Daemons with twistd

Twisted ships with a nice daemon runner called “twistd” that can do a lot of different things for your Twisted plugin or application.  It can set the UID/GID of your process, open up a log file and manage a PID file for you.   All of this is configured with command-line options to twistd.


While each of these options is well documented, the way in which they interact and the order in which these properties are applied to daemon creation are not.  Starting a daemon involves grabbing some ports, changing UID/GID, opening up and managing both a logfile and a PID file.  The management of the logfile and PID file is complicated by the fact that the UID of twistd changes between the time these files are created and when twistd wants to modify them.

This post explains the sequence of operations that twistd performs for starting a daemon defined in a TAC file on Unix.  We will consider only a minimal subset of the options that twistd handles so that we can focus on the interactions due to the daemon’s UID/GID changing.  Our assumptions are that:

  • twistd is started as user root,
  • our daemon will run at reduced privileges,
  • our application is defined in a TAC file and
  • we are discussing Unix only.

Along the way we’ll explain how the order in which twistd performs its operations creates a few “gotchas” that one might run into when setting up logging and a PID file.

An Overview of the twistd Daemon-starter

Although twistd actually has more options than are shown here, the ones that we are interested in for this article are summarized below.

twistd --uid=UID --gid=GID --umask=UMASK --chroot=CHROOT
     --rundir=RUNDIR --pidfile=PIDFILE --logfile=LOGFILE -y myapp.tac

Here is an outline illustrating the steps that twistd performs in order to create the daemon from the TAC file. We’ll describe each of the steps in turn.

  1. preApplication
    1. checkPID
  2. createApplication
    1. readTacFile
    2. startLogger
  3. postApplication
    1. setupEnvironment
      1. chroot
      2. chdir
      3. umask
      4. daemonize
      5. openPIDFile
    2. privilegedStartService
    3. switchUidGid
    4. startApplication
    5. startReactor
    6. removePIDFile

The “preApplication” phase is what is executed before the application is even created.

  • checkPID – Check for the existence of a prior PID file and remove it if the old PID does not correspond to a currently running process.  This step is performed as user root.
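The stale-PID check can be sketched with the standard library alone.  This is a minimal illustration of the technique, not twistd’s actual code; the function names are hypothetical, and sending signal 0 with os.kill is a common Unix idiom for testing whether a process exists.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID currently exists (Unix only)."""
    try:
        os.kill(pid, 0)  # signal 0 only performs error checking; nothing is delivered
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # the process exists but belongs to another user
    return True


def check_pid(pidfile):
    """Remove a stale PID file; refuse to continue if its process is still alive."""
    if not os.path.exists(pidfile):
        return
    with open(pidfile) as f:
        pid = int(f.read().strip())
    if pid_is_running(pid):
        raise RuntimeError("Another instance seems to be running with PID %d" % pid)
    os.remove(pidfile)   # stale file left behind by a process that is gone
```

Because this step runs as root, removing a stale file here succeeds regardless of who owns the directory; the permission problems described later only appear after privileges have been dropped.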

The “createApplication” phase deals with the instantiation of the object that defines a Twisted application.  An application is a service object created with a call to the function twisted.application.service.Application.

  • readTacFile – Read the contents of the TAC file “myapp.tac” as Python source code.  Evaluate it in a completely empty namespace. Upon completion return the value of the variable “application” (if there is one) and discard everything else.  Note: this step is performed as user root.  The code in the TAC file is free to import Python modules at will and can interact with the file system, but it is not able to access global Python state and can only return a single value.
  • startLogger – If the Application object returned in the previous step has not defined a logger, then give this application a default rotating logfile.  This step is performed as user root.  The logfile will be created with name LOGFILE and owned by user “root.”
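The readTacFile step can be approximated in a few lines of plain Python.  This is only a sketch of the idea, assuming the behavior described above; it is not Twisted’s own loader, and the function name load_application is hypothetical.  In a real TAC file the “application” variable would hold the object returned by twisted.application.service.Application rather than a string.

```python
def load_application(source, filename="myapp.tac"):
    """Run TAC-style source in a fresh namespace and return its `application`."""
    namespace = {}
    exec(compile(source, filename, "exec"), namespace)
    # Everything else the file defined is discarded along with the namespace.
    return namespace.get("application")


# A stand-in for the contents of myapp.tac (illustrative only).
tac_source = """
import os
helper = os.path.basename("/srv/myapp")   # thrown away after loading
application = "placeholder for " + helper
"""

print(load_application(tac_source))
```

The snippet shows why code in the TAC file can import modules freely yet leaves nothing behind except the single value bound to “application.”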

The “postApplication” phase does most of the real work of running the Twisted application.

  • setupEnvironment – Call chroot(CHROOT), chdir(RUNDIR) and set the umask to UMASK.  Daemonize the process by doing the double-fork trick.  Lastly, create a PID file with name PIDFILE.  The PID file will be created by user “root.”
  • privilegedStartService – Grab the ports that are needed.
  • switchUidGid – Change the daemon’s user and group IDs to UID and GID.
  • startApplication – Call startService on our application’s service object.
  • startReactor – Set the Twisted event reactor in motion.  It is while the reactor is running that the logfile will be rotated.  Log management will be done with the privileges of UID/GID.
  • removePIDFile –  After the reactor has finished, remove PIDFILE.  This will be done with permissions UID/GID.
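The ordering of privilegedStartService and switchUidGid follows a common Unix pattern: acquire the resources that need root first, then drop privileges.  The sketch below illustrates that pattern with the standard library; it is not twistd’s implementation, and the function name bind_then_drop is hypothetical.

```python
import os
import socket


def bind_then_drop(port, uid, gid):
    """Bind a listening socket, then switch to uid/gid if started as root."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", port))  # ports below 1024 normally require root
    sock.listen(5)

    # Only root can change identity, and there is nothing to drop if the
    # target user is root itself.
    if os.geteuid() == 0 and uid != 0:
        os.setgroups([])  # drop supplementary groups
        os.setgid(gid)    # change the group first, while we still may
        os.setuid(uid)    # after this call root privileges are gone
    return sock
```

The socket stays usable after the switch because the privilege check happens at bind time.  Files behave differently: renaming or removing them is checked against the directory each time, which is what causes the logfile and PID file issues.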

Implications for Logfile Rotation

Using the configuration described here, the LOGFILE will be created as user “root” and group “root”, but rotated as user UID and group GID.  If you want rotation to work as advertised it is necessary to put the LOGFILE in a directory in which UID/GID has permissions to rename files.

Implications for PID file creation

The PIDFILE will be created as user “root” but when it comes time to remove it, the daemon process will have the permissions UID/GID.  If you want your daemon to be able to remove its PID file, then it should be placed in a directory in which UID/GID has permissions to remove files.
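Both the logfile and the PID file cases come down to the same Unix rule: renaming or removing a file requires write and search permission on the directory that contains it, not on the file.  The following sketch checks that rule for a given UID/GID.  It is a simplification (it ignores supplementary groups and ACLs), the function name is hypothetical, and it assumes the file in question is owned by root, as it is in the scenario described here.

```python
import os
import stat


def can_modify_entries(dir_path, uid, gid):
    """Rough check: may uid/gid rename or remove a root-owned file in dir_path?"""
    st = os.stat(dir_path)
    if uid == 0:
        return True  # root bypasses these permission checks
    # In a sticky directory (such as /tmp) only the file's owner, the
    # directory's owner, or root may remove or rename an entry.
    if st.st_mode & stat.S_ISVTX and st.st_uid != uid:
        return False
    if st.st_uid == uid:
        need = stat.S_IWUSR | stat.S_IXUSR
    elif st.st_gid == gid:
        need = stat.S_IWGRP | stat.S_IXGRP
    else:
        need = stat.S_IWOTH | stat.S_IXOTH
    return (st.st_mode & need) == need
```

Running a check like this against the directories chosen for LOGFILE and PIDFILE, with the UID and GID passed to twistd, is a quick way to catch a misconfiguration before the daemon fails to rotate its log or clean up its PID file.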

Conclusion

Twisted’s daemon-runner is a useful and well-tested program that has been in use for a long time.  Some of its side-effects are due to the order in which it performs its steps.  This note laid out some of these steps to explain how process permissions interact with logfile rotation and PID file removal.