
Friday, July 25, 2008

GWT and RESTlets

I'm back, with some big news!

RESTlet ported to GWT!

I have been busy with a large-scale, enterprise AJAX Web application, and I'm happy to say I've had to write at most 200 lines of Javascript; instead, we have about 80,000 lines of Java. Most of you probably know why ... we're using Google Web Toolkit (GWT) (+ RESTlets), and I thought I'd share my love for these libraries. (Before we delve deep into my ramblings, I'd like to point out there are other libraries that rival RESTlet's fantastic-ness, such as Jersey, but I've not used them extensively enough to write about them, so feel free to post your links to other blogs!)

If you haven't heard of GWT or just haven't taken the time to check it out, you really, really should [ http://code.google.com/p/googlewebtoolkit/ ]. In short, Google developed a compiler that translates Java source into Javascript, so you develop your client-side code just as if you're developing a Java app (don't worry, custom Javascript can be invoked using JSNI), and Google does the rest of the work.

If you're going to build both the client-side and server-side bits, GWT has some built-in support for you. It provides RPC mechanisms, as well as ways to make plain ol' HTTP requests. First up: RPC! Simply explained, an RPC-based application is built from just a few pieces and parts - an interface so the client knows what methods it can invoke, and an implementation of those methods on the server. Having used it, though, I find RPC a bit arduous to maintain and clunky to build, and it can be slow as molasses. But it works, makes the client-side code easy to read, and makes methods more intuitive to invoke.
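Stripped of GWT's plumbing, the RPC pattern reduces to a shared interface plus a server-side implementation. A plain-Java sketch of that shape (the service and method names are made up for illustration; in real GWT the interface extends RemoteService, the client calls through an asynchronous companion interface with an AsyncCallback, and the server class extends RemoteServiceServlet):

```java
// The RPC pattern in miniature: the client compiles against the interface,
// the server provides the implementation. Names (TShirtService, getColors)
// are hypothetical.
interface TShirtService {
    java.util.List<String> getColors();
}

class TShirtServiceImpl implements TShirtService {
    // Server-side implementation; in GWT this class would extend
    // RemoteServiceServlet and live in your server package.
    public java.util.List<String> getColors() {
        return java.util.Arrays.asList("red", "blue", "green");
    }
}
```

The maintenance burden I mentioned comes largely from keeping the interface, its async twin, and the implementation in lock-step as the service grows.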

Now, if you want to provide server-side functionality via HTTP requests instead of RPC calls, the RequestBuilder class is what you want. It has a simple and relatively robust API, but it has its drawbacks. If I had to pick a fight, I'd say it requires a bit of knowledge about the lower-level workings of HTTP requests (e.g. header syntax), and it doesn't provide a quick way to get XML from the Response object as a Document, but there aren't any really pressing issues that I've encountered.
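That last gripe is easy to work around, since the parse step is only a few lines. Here's the parsing half in plain Java using the standard DOM APIs (in GWT itself you'd hand Response.getText() to XMLParser.parse() from the com.google.gwt.xml.client package; the helper class below is a hypothetical stand-in, not part of either library):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;

class XmlResponse {
    // Turn an HTTP response body into a DOM Document -- the step GWT's
    // Response object leaves you to do by hand.
    static Document parse(String responseText) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(responseText.getBytes("UTF-8")));
    }
}
```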

So enough about GWT - what is this RESTlet stuff? Well, it's based on a particular architectural style, developed in the wonderful world of Academia (don't run away now): Representational State Transfer (REST). It's just a way of designing how to manage and access resources, and once you figure it out, you'll notice a lot of the Internet basically works this way already. Let's say you're building a store that sells T-shirts of many colors. A RESTful way of modelling your store-front would be to provide the resources of your store, e.g. the T-shirts, in a representational way. So to see a representation of all T-shirts sold at the store, one might visit the url http://www.tshirtstore.com/storefront/tshirts. To see a representation of all blue T-shirts, one could visit /storefront/tshirts/blue. To visit the cart, one could go to /storefront/cart. These resources can be delivered as XML, XHTML, JSON, binary, plain-text ... you name it. Each URL basically represents a resource and ways to get to other resources, if applicable. Intuitive? I tend to think so.
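A toy sketch of that path-to-resource mapping, with hypothetical store data (in a real RESTlet application each path would be routed to its own resource class rather than dispatched by hand like this):

```java
import java.util.*;

class TShirtStore {
    // Toy resource lookup: map a RESTful path onto a representation.
    // The store contents and URL scheme mirror the example above.
    private final Map<String, List<String>> shirtsByColor = new LinkedHashMap<>();

    TShirtStore() {
        shirtsByColor.put("blue", Arrays.asList("navy tee", "sky tee"));
        shirtsByColor.put("red", Arrays.asList("crimson tee"));
    }

    String represent(String path) {
        if (path.equals("/storefront/tshirts")) {
            // Representation of all T-shirts, grouped by color.
            return String.join(",", shirtsByColor.keySet());
        }
        if (path.startsWith("/storefront/tshirts/")) {
            // Representation of one color's worth of T-shirts.
            String color = path.substring("/storefront/tshirts/".length());
            List<String> shirts = shirtsByColor.get(color);
            return shirts == null ? "404" : String.join(",", shirts);
        }
        return "404";  // unknown resource
    }
}
```

Note that the representation here is plain text only for brevity; the same lookup could just as well hand back XML or JSON.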

By the by, my colleague explains it well (i.e. better) in non-example-based terminology; if you'd like to see it, go here: http://gogocarl.blogspot.com/

In any case, REST is a wonderful way to expose resources and functionality to your client code in an intuitive way, and the greatest part of all is that it highly decouples your client and server code bases, so if you deliver your resources in a client-independent way (e.g. XML), you can show off your resources in many different ways.

If you'd like to create a GWT + RESTlet application, check out the RESTlet-GWT module [ http://blog.noelios.com/2008/07/25/restlet-ported-to-gwt/ ]. If you give it a try let me know how it goes! I've had such good luck with it I'm rather biased towards it, but I'm still a bit of a n3wb Web app developer, and I'd be very interested to know what your opinions are.

Wednesday, October 24, 2007

Relational Data as XML

Everything is fodder for argument with XML. Note these 50 pages of point-counterpoint that discuss nearly every aspect of XML's use and usability - it's a pretty quick read [http://www.zapthink.com/report.html?id=ZT-XMLPROCON].

The short of this entry is that storing data on the server-side as XML documents is a very flexible, readable, maintainable, and most importantly scalable option for a Web application. Document size issues that tend to accompany XML-based storage can be avoided by balancing the size of documents against external contextual data (i.e. indexing information). Performance and scalability for even the most in-demand applications can be as outstanding as any highly-trafficked Web site, if you keep in mind the metaphor of serving your XML documents as Web pages.

The traditional rule of thumb is to put "data" in a database and store documents on a file system. But breaking up a document's worth of data to shove into a database is not a difficult task, and composing a table or two into a document of information is also not difficult. Where things become complicated is when mounds of contextual metadata are generated to cope with relational data. Separating contextual data from content can give you immense power over what you serve, when you serve it, and how you manage it.

-----

There are many time-sinks and concerns when using XML to describe your data, a few of which are engineering the documents' structures, handling data serialization, performance issues, and information bloat. Before you can massage your data into XML, you need to decide what structure the document should take - sometimes this is not trivial, especially when representing relational data, and it will take time. Given the hitch that binary serialization is brittle and not always platform/hardware independent, massaging your data so it can be serialized as String representations may also take time; complex objects need to be broken up into primitives to be represented nicely.

Once this is finished, immediately one might notice that some XML representations are thousands of lines long (I made this wonderful mistake more than once), slowing your parser and increasing the transfer time towards O(eternity); the more complex the object, the more contextual descriptions and metadata will be needed to describe the object. Ugh.

Since there is (obviously) a connection between the size of your XML document and the time it takes to parse its content, a good way to increase performance is to break your data into finer levels of granularity. For instance, if you have a document representing the furniture in your house by room, you can easily break the document up into many documents, each representing one room.

With each room separated out, a natural idea would be to add an additional tag under the "room" parent node (or as an attribute) that identifies which house it belongs to. If we break things down even farther, and separate out each piece of furniture into its own document, we would need to add a tag that identifies which room each piece of furniture belongs in.
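Sketching that with the standard DOM APIs - all the names here (RoomDoc, the house attribute) are mine for illustration, not from any framework:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RoomDoc {
    // Build a stand-alone "room" document that carries its context -- which
    // house it belongs to -- as an attribute on the root node.
    static Document create(String house, String room, String... furniture)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("room");
        root.setAttribute("name", room);
        root.setAttribute("house", house);  // the contextualizing tag
        doc.appendChild(root);
        for (String piece : furniture) {
            Element f = doc.createElement("furniture");
            f.setTextContent(piece);
            root.appendChild(f);
        }
        return doc;
    }
}
```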

This method of "contextualizing" XML documents has a problem. Since the information is internal to the document, each time the context of a document needs to be determined it must be parsed and searched. Additionally, this method only allows the "child" document to know who its "parent" is - a house would not know what furniture it has, only the furniture would know which house it belongs to. There are at least two ways to solve this, and the first is obvious - after breaking up a large document, build a "virtual" document that has references to its pieces and parts. This is not a terrible way of doing things - it can, however, lead to an enormous amount of files.

The alternative is to externalize all "virtual resources" into one document, namely a large indexing file. So if multiple house documents - "FooHouse" and "BarHouse" - each have their rooms stored as separate files, a master document will list each house's room-document identifiers under that house's identifier. When a user requests the resource "FooHouse", the master document (which is presumably kept in memory for quick traversal) will either assemble the document, or - assuming a screen could not handle showing every room's contents all at once - simply serve each room document when requested.
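A minimal in-memory version of such a master index might look like this (the storage locations are plain strings on purpose, since the index shouldn't care whether they name files, database keys, or URLs):

```java
import java.util.*;

class MasterIndex {
    // External index: maps a house identifier to the locations of its room
    // documents. Context lives here, not inside the documents themselves,
    // so no room file has to be parsed just to answer "what rooms does
    // FooHouse have?"
    private final Map<String, List<String>> roomsByHouse = new HashMap<>();

    void register(String house, String roomLocation) {
        roomsByHouse.computeIfAbsent(house, h -> new ArrayList<>())
                .add(roomLocation);
    }

    // Resolve a conceptual resource to the documents that compose it.
    List<String> resolve(String house) {
        return roomsByHouse.getOrDefault(house, Collections.emptyList());
    }
}
```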

This solution can scale well, requires no particular storage model (the master document could refer to tables in a database, files on a disk or even a URL) and allows a conceptual resource to be as complex as needed. It also allows for performance tuning, as the client can receive only the portions of the resource pertinent to their particular task.

Tuesday, October 23, 2007

It's the Internet, son

If you're reading this, you most likely know what TCP is. You also might know what UDP is, and what the differences between the two are. If you don't, read a little of the Wikipedia entry on the Internet [http://en.wikipedia.org/wiki/Internet], and skim the links about these network protocols. The Internet is the embodiment of these standards, and this is where we will start.

Web applications that we are concerned with are first and foremost applications, layers of software that provide some sort of service to a person or group of people. A Web application is only special insofar as it is deployed on the Internet and is accessible by contacting a hosting server via some sort of network browser.

"Right thing" [http://www.jwz.org/doc/worse-is-better.html] advocates would say the goal of a Web application is to provide the client the illusion that the application is native to their machine, providing seamless access to data and functionality that is, in actuality, housed on a set of machines somewhere far across the Internets. This is admirable, and should be the goal of all Web developers. They might continue on to say the interface should be simple, consistent, and complete at all costs, even if the implementation of the application suffers from complexity.

Of course, there are many open questions when we descend from our 10,000 ft. goal - if our data set is large, how do we serve that across millions of miles of cable without the user waiting for it? If the user has a high-latency satellite connection to the Internet, how do we get around her experiencing round-trip-times in the seconds? How do we provide uniform access to resources across disparately performant physical mediums? ("Right thing" advocates are probably too busy maintaining, debugging and securing their RPC stub generators to answer these questions, so I'll do it for them.)

The obvious answer is that you can probably spend all of the world's software contractors' budgets combined and never be able to do things the "right way." The interface will never be simple enough to completely obfuscate the idea that a Web application is deployed on the Internet. The less obvious answer is that we can still do things simply, chiefly because the Internet already has mechanisms in place to do this for us - remember TCP and UDP? TCP is the one HTTP rode on to serve this very Web page, this (hopefully X)HTML document. They are the Postal Service of the Internet - with the power of the OSI model layers [http://en.wikipedia.org/wiki/OSI_model] combined, your documents can be delivered to and fetched from your clients the best way the Internet knows how - GETs, POSTs and PUTs.

So why shouldn't your Web application make use of this document delivery service?

If your data is already in or is easily convertible to a Web-friendly XML or JSON format, you are in business. Boot up a Web server to serve up those documents, create a client-side application that can communicate via GETs, POSTs and PUTs (GWT [http://code.google.com/webtoolkit/] is a good, AJAX-y option) and you have an architecture that lies flat against the original design principles of the Internet.
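As a sketch of that architecture at its smallest, the JDK's built-in com.sun.net.httpserver will happily hand out a canned XML document over GET (the document content and path here are hypothetical, reusing the T-shirt store from an earlier post):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

class DocumentServer {
    // Minimal "document delivery" server: one resource path, one XML
    // representation served over GET. Port 0 lets the OS pick a free port.
    static HttpServer start() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/storefront/tshirts", exchange -> {
            byte[] body = "<tshirts><tshirt color=\"blue\"/></tshirts>"
                    .getBytes("UTF-8");
            exchange.getResponseHeaders().set("Content-Type", "application/xml");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

Everything past this point - versioning, caching, content negotiation - is refinement on top of the same GET-a-document core.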

Of course, you can optimize the Web server by designing your own file system to version, cache, and pre-fetch your documents. Or if you're less adventurous, you can download a lightweight servlet framework that can do much of this for you [http://www.restlet.org/].

This discussion is far from over, of course. The next post will discuss the benefits and drawbacks of maintaining data as XML documents, and how fine-tuning the granularity of contextual metadata will determine the performance of your application.