Cross-Origin Resource Sharing for JSON and Rails

CORS (Cross-Origin Resource Sharing) is a protocol built on top of HTTP that allows Javascript on a page originating from one site to access methods on another site.  This is the preferred way for Javascript code to escape from its default Same-Origin Policy.  While the protocol has been around for a few years and is built into all of the major browsers, it does not seem to be widely documented.  Here are some experiences I’ve had while enabling cross-origin access for JSON from a Rails server.

Background

The Same-Origin policy restricts Javascript code to making Ajax calls to the site that its containing page came from.  For example, Javascript on a page from http://mysite.com is not permitted to make Ajax calls to a web-service at http://othersite.com/method.json.  This policy is a security measure that prevents unwitting visitors to a website from executing malicious code in their browsers.

For web-service implementors, this is an annoying restriction.  Letting Javascript access data from multiple sites would allow programmers to create browser-based mash-ups.

Server-side proxying is a traditional method for getting around the Same-Origin restriction on Ajax requests.  With proxying, the owner of mysite.com implements a copy of the remote method on his own site that simply repeats the request.  The server on “mysite.com” processes each remote method call by calling the corresponding method on “othersite.com” and returning the results to the browser.

http://mysite.com/method.json --> http://othersite.com/method.json

The great advantage of this method is that it works with any browser.  The drawback is that it imposes seemingly redundant work on web-service subscribers.
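
Here is a minimal sketch of such a proxy as a Rails controller action, reusing the placeholder URLs above; the ProxyController name and the proxied_method action are hypothetical, and error handling is omitted:

require 'net/http'

class ProxyController < ApplicationController
  # Handles a request like http://mysite.com/method.json by repeating
  # it against othersite.com and handing the JSON back to the browser.
  def proxied_method
    remote = URI.parse('http://othersite.com/method.json')
    body   = Net::HTTP.get(remote)
    render :text => body, :content_type => 'application/json'
  end
end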

Cross-Origin Resource Sharing

CORS is a protocol negotiated between a browser and a web-service that tells the browser that it is “OK” to execute Javascript code from a cross-domain call.  The specification covers “Simple” transactions and complex transactions that use a “Preflight” request.  Cross-origin JSON requests with non-standard headers are not “Simple” and require the “Preflight” request.

A great introduction to the CORS protocol appears here:
http://www.nczonline.net/blog/2010/05/25/cross-domain-ajax-with-cross-origin-resource-sharing/

An example of the CORS transactions appears here:

http://arunranga.com/examples/access-control/preflightXSInvocation.txt

The “Preflight” request uses the HTTP OPTIONS verb.  In a browser implementing CORS, each cross-origin GET or POST request is preceded by an OPTIONS request that checks whether the GET or POST is OK.  If it is, the server must return headers that allow the subsequent GET or POST.  This is actually a wonderful capability: the server can allow or disallow remote access on a per-method basis, with access determined by HTTP referrer, IP address, or any other criteria.

The OPTIONS request contains Access-Control headers that are part of the CORS specification.  The server’s response must answer these headers for the subsequent GET or POST to proceed.

For example, an access to

GET http://othersite.com/method.json

would be preceded by an OPTIONS request that looks like this:

OPTIONS http://othersite.com/method.json
Origin: http://mysite.com
Access-Control-Request-Method: GET

The server would respond with the headers that allow the request and an empty “text/plain” body:

Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, OPTIONS
Access-Control-Max-Age: 1728000
Content-Length: 0
Content-Type: text/plain

Custom Headers

If your application uses non-standard headers, you must take special steps to permit them or the browser will flag a CORS violation.   I ran into this restriction in the application I was writing.  Unfortunately, the security violation messages from the browser are obscure, and it took me a while to figure this out.

In our application, the Javascript client uses prototype.js to make Ajax calls.  Prototype adds the following headers to each request.

X-Requested-With: XMLHttpRequest
X-Prototype-Version: N.N.N.N

Our server must explicitly allow these headers in the CORS exchange or the browser will disallow the cross-origin request. The OPTIONS request specifies the headers the client wants to add, and the response must list them in Access-Control-Allow-Headers. Our OPTIONS/response exchange looks like this.

OPTIONS http://othersite.com/method.json
Origin: http://mysite.com
Access-Control-Request-Method: GET
Access-Control-Request-Headers: X-Requested-With, X-Prototype-Version

Response:

Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, OPTIONS
Access-Control-Allow-Headers: X-Requested-With, X-Prototype-Version
Access-Control-Max-Age: 1728000
Content-Length: 0
Content-Type: text/plain

CORS in Rails

I implemented the CORS protocol in a Rails application with just a couple of filter methods added to my controller. Here they are. (If you want to follow this technique, you’ll need to make sure your routes allow the HTTP OPTIONS verb; a sketch of such a route follows the filter code below.)

before_filter :cors_preflight_check
after_filter :cors_set_access_control_headers

# For all responses in this controller, return the CORS access control headers.

def cors_set_access_control_headers
  headers['Access-Control-Allow-Origin'] = '*'
  headers['Access-Control-Allow-Methods'] = 'POST, GET, OPTIONS'
  headers['Access-Control-Max-Age'] = "1728000"
end

# If this is a preflight OPTIONS request, then short-circuit the
# request, return only the necessary headers and return an empty
# text/plain.

def cors_preflight_check
  if request.method == :options
    headers['Access-Control-Allow-Origin'] = '*'
    headers['Access-Control-Allow-Methods'] = 'POST, GET, OPTIONS'
    headers['Access-Control-Allow-Headers'] = 'X-Requested-With, X-Prototype-Version'
    headers['Access-Control-Max-Age'] = '1728000'
    render :text => '', :content_type => 'text/plain'
  end
end
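
As noted above, the route itself must accept OPTIONS as well as GET. A minimal sketch in Rails 3-style routing (the path and controller name are hypothetical; older versions use :conditions on the route instead of :via):

# config/routes.rb -- let the preflight OPTIONS reach the same action as GET
match 'method.json' => 'things#show', :via => [:get, :options]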

The before_filter, cors_preflight_check, is the last in my filter chain: earlier filters check for allowed access. If the request uses the OPTIONS method, it short-circuits the request, adds the necessary headers, and returns a blank text body.
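
As a sketch of what such an earlier filter could look like, the following (hypothetical) check rejects any Origin that is not on a whitelist rather than answering every caller with “*”:

before_filter :check_cors_origin   # runs ahead of :cors_preflight_check

# Hypothetical whitelist; adjust to the origins you actually trust.
ALLOWED_ORIGINS = ['http://mysite.com', 'http://another-trusted-site.com']

# Reject cross-origin callers whose Origin header is not on the whitelist.
def check_cors_origin
  origin = request.env['HTTP_ORIGIN']
  if origin && !ALLOWED_ORIGINS.include?(origin)
    render :text => '', :status => 403, :content_type => 'text/plain'
  end
end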

The after_filter, cors_set_access_control_headers, runs for every response returned by this controller and adds the CORS headers to everything else.
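
To sanity-check the whole exchange, a short Ruby script can issue the preflight by hand and print the headers that come back. This is just a sketch using the placeholder host names from the examples above:

require 'net/http'
require 'uri'

uri = URI.parse('http://othersite.com/method.json')   # placeholder URL
Net::HTTP.start(uri.host, uri.port) do |http|
  preflight = Net::HTTP::Options.new(uri.path)
  preflight['Origin'] = 'http://mysite.com'
  preflight['Access-Control-Request-Method'] = 'GET'
  preflight['Access-Control-Request-Headers'] = 'X-Requested-With, X-Prototype-Version'
  response = http.request(preflight)

  # These should echo the values set by the filters above.
  puts response['Access-Control-Allow-Origin']
  puts response['Access-Control-Allow-Methods']
  puts response['Access-Control-Allow-Headers']
end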

Summary

CORS is implemented in all of the popular browsers, but client-side access seems to vary between IE and Safari/Chrome/Firefox. For me, it was interesting to see how server-side access control is not too different from what Adobe does for server-side policy files. It would be nice if these access controls could be unified, but I’m just happy to have them.

Ruby on Rails and Verification

I’ve spent the last few months learning Ruby and using Rails.  Ruby is a dynamic, interpreted language with introspection and a rich syntax for expressing things concisely.  Rails is a web-framework that helps people create industrial-strength web sites.  Both the language and the framework are developed by and for programmers.  I am very comfortable in this environment!


In a previous life, I worked in Logic Verification for semiconductors.  Over the last decade, mainstream Verification evolved from a process in which engineers used simulators to look at waveforms, to self-checking test-benches, to test-generators using randomness to sample a space of possible device behaviors.  It’s interesting to notice how the Rails community has embraced “tests” and “specs” as a fundamental part of the development process.  Good Ruby programmers check in code with accompanying unit-tests.  And because Ruby has standard packages to express these tests, it is pretty easy to do.  Standard Rails packages extend these ideas to the testing of web sites.  Here, “controller tests” correspond roughly to “block-level” verification, and “integration tests” correspond roughly to “system-level” verification.

There are other similarities.  The “stimulus” of a logic test is roughly an “HTTP Request” and instead of a “simulator” we’re testing a “web-server.”  Logic “responses” are “web-pages.”  Here is where the comparisons end, however.  Most Web 2.0 web-pages have Javascript in them, and thus the response itself is executable.

In some ways the Rails community is way ahead of the Logic Verification community.  Mutation testing, as a way of debugging the test-bench, is available as a standard add-on to Rails.  Cucumber lets Rails developers write specifications in a high-level language that is much clearer than anything I saw in Logic.  The ideas of constrained-random testing have not yet been embraced in the Rails world, as far as I can tell.  But since there’s a Gem for almost everything in Ruby, I wouldn’t be surprised to see it soon.

Ruby On Rails on Google AppEngine

A few nights ago, I went to a talk about using Ruby on Rails (RoR) on Google’s AppEngine.  Python was the first language supported by AppEngine; Java is the second.  Using RoR on AppEngine requires compiling Ruby to Java bytecode using the JRuby compiler and runtime environment.  RoR applications on AppEngine gain the security benefits of running in a JVM, but this also imposes restrictions on what sort of packages the application can use.

Moving an existing RoR application to AppEngine requires remapping ActiveRecord data from a SQL database to Google’s datastore, accessed through DataMapper.  This store is basically a hash-map that may be transparently distributed over multiple machines.  Joins are not possible with this representation.  The speaker claimed that for data suited to this type of store, queries take time proportional only to the size of the result set.
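
As a rough sketch of what that remapping looks like (the Post model here is hypothetical, and adapter setup is omitted), a class that would have been an ActiveRecord model becomes a DataMapper resource with explicit property declarations:

require 'dm-core'

# Hypothetical model declared for DataMapper instead of ActiveRecord;
# on AppEngine its properties map onto entries in the distributed datastore.
class Post
  include DataMapper::Resource

  property :id,         Serial
  property :title,      String
  property :created_at, DateTime
end

# Key lookups and simple filters work; SQL-style joins do not.
post   = Post.get(1)
recent = Post.all(:limit => 10)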

To get your code and data up to the server you need to compile it (using JRuby) and upload it (as JAR files).  Managing your static file content and Ruby gems sounded tricky because there isn’t a proper file manager or shell interface.  And it isn’t possible to use gems that have binary components (like RMagick – the Ruby binding for the ImageMagick package).

Performance tuning seemed difficult.  Threading is not supported to interleave transactions.  Rather, new machine instances are spawned based on load; a new instance takes about 30 seconds to get up and running.

It’s an intriguing offer: use Google’s machinery to host applications in the cloud that scale transparently and are free for up to about 500M pages per month.  For me, it seems there are too many layers between the application and the hardware, since you never really get a host: at any given time your application may be running on any number of hosts, anywhere.  It will be interesting to watch this type of cloud computing fight it out in the marketplace with cheap managed virtual hosts and other approaches.