Wednesday, April 14, 2010

On Web Applications, Web Architecture And Resource Identifiers

1 On Web Applications, Web Architecture And Resource Identifiers

1.1 Background

As we evolve from a Web of documents (Web 1.0) to a Web of applications (Web 2.0), and eventually beyond (see "Toward 2^W, Beyond Web 2.0"), key underpinnings of Web Architecture such as resource identifiers require careful re-examination. As a member of the W3C's Technical Architecture Group, I have been trying to define Web Architecture in the context of Web applications; a necessary first step toward that goal is to analyze how complex Web applications are implemented on the Web of today.

This article deliberately avoids abstract issues such as Resource vs. Representation or URIs vs. URLs, and instead focuses on more practical considerations such as:

  1. What is a URI and what can the user expect to do with it?
  2. When dereferencing a URI, what pieces of software does one need to have to retrieve a useful representation of that resource?
  3. Here, useful is defined from the perspective of the end-user. Thus, given a URI to a piece of media on the Web, relevant metadata is necessary but not sufficient to be useful - the user needs to be able to retrieve and play the media stream as well.

1.2 Case Study: BBC iPlayer And BBC Backstage

The British Broadcasting Corporation (BBC) provides streaming access to a large amount of radio and television content via a Web application called BBC iPlayer. In addition, BBC Backstage provides a rich data-oriented API to the underlying dataset in the form of linked data. Program schedules can also be downloaded in a number of presentation-independent formats such as XML, JSON and YAML. The remaining sections in this article detail what can (and cannot) be done with the information that is readily available from BBC iPlayer and BBC Backstage. In the process, we observe some design patterns (and anti-patterns) found on today's Web, and their effect on building richer Web applications from Web parts.

1.3 BBC iPlayer

Using the BBC iPlayer Web application requires:

  1. A modern script-enabled browser such as Chrome, Firefox, Safari, or IE.
  2. Browser plugins for media playback appropriate to the user's platform, e.g., RealPlayer or Windows Media.
  3. The Adobe Flash plugin for translating playback links on the BBC iPlayer page to their corresponding RealPlayer or Windows Media resources.

The Web application as implemented provides a rich, interactive visual interface that is sub-optimal for use from other programs.

1.4 BBC Backstage

Given the triple (radio-station, outlet, date) e.g.:

 (radio4, fm, 2010/04/14)
one can retrieve an XML representation of the program schedule using the URL:
 http://www.bbc.co.uk/radio4/programmes/schedules/fm/2010/04/14.xml
as documented on the BBC Backstage site. Alternative serializations such as JSON or YAML can be retrieved by appropriately replacing the .xml extension.
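
The mapping from triple to URL can be sketched in a few lines of Python. This is only an illustration of the pattern visible in the example URL above: the function and parameter names are hypothetical, and the .json and .yaml extensions are assumed from the description rather than verified.

```python
from datetime import date

# URL pattern inferred from the example URL in the text; the names here
# are illustrative and not part of any BBC-provided library.
SCHEDULE_URL = (
    "http://www.bbc.co.uk/{station}/programmes/schedules/"
    "{outlet}/{day:%Y/%m/%d}.{fmt}"
)

# Assumed extensions for the alternative serializations.
FORMATS = ("xml", "json", "yaml")


def schedule_url(station: str, outlet: str, day: date, fmt: str = "xml") -> str:
    """Build the schedule URL for a (radio-station, outlet, date) triple."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported serialization: {fmt}")
    return SCHEDULE_URL.format(station=station, outlet=outlet, day=day, fmt=fmt)


print(schedule_url("radio4", "fm", date(2010, 4, 14)))
```

Because the URL is a pure function of the triple, a client can compute it without first consulting the server, which is what makes this part of the API easy to use from other programs.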

The retrieved schedule contains detailed metadata for each program that is broadcast, including a programme id (pid) that is used throughout the data store.

The BBC Backstage API assigns a persistent URI to each program of the form:

 http://www.bbc.co.uk/iplayer/episode/<pid>

When retrieved, this persistent URI redirects appropriately to the BBC iPlayer page for that program. Note that the media streams for most programs are only available for a week.

As an example of the above, you can retrieve Midnight News from BBC Radio4 for April 14, 2010 (pid b00rw6hf) by opening the URL:

 http://www.bbc.co.uk/iplayer/episode/b00rw6hf

On the surface, this URL appears to satisfy many of the expectations that users might have:

  1. Plays the relevant media when handed to a Web browser.
  2. Can be bookmarked for later use (modulo the 1 week limit on archived media).
  3. Can be passed around via email?

The final bullet above exposes some of the problems with the current implementation. Note the set of pre-requisites for the BBC iPlayer Web application enumerated earlier; all of these apply to the URI generated above.

1.5 How It Works At Present

It is instructive to turn on HTTP Request/Response tracking in the browser when opening URL http://www.bbc.co.uk/iplayer/episode/b00rw6hf. Here is a brief summary of some of the steps that the browser performs:

  1. Receives an HTTP Response with content-type text/html.
  2. The body of this response is an HTML document that in turn loads a number of JS libraries.
  3. An embed tag in the retrieved HTML page invokes the Flash (Shockwave) plugin.
  4. The embedded Shockwave player receives several mostly undocumented parameters that pass in details of the enclosing environment.
  5. Once these steps have completed, the browser is automatically redirected to http://www.bbc.co.uk/iplayer/console/b00rw6hf, i.e., the earlier URI is transformed by replacing episode with console.
  6. The HTTP conversation continues, and the browser is eventually sent to http://www.bbc.co.uk/mediaselector/4/mtis/stream/b00rw6g2 which resolves to the RealPlayer .ram file: http://www.bbc.co.uk//iplayer/aod/playlists/2g/6w/r0/0b/RadioBridge_intl_2300_bbc_radio_fourfm.ram.

Thus, the recipient of the Midnight News URL would need to implement all of the above transforms (or have access to software that does those computations) in order to effectively consume the media stream that was addressed by the URL.
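
Of the steps listed above, only the episode-to-console rewrite in step 5 is a mechanical transform of the URL itself. The following Python sketch shows just that step, with hypothetical function names. The later hops cannot be computed this way: the mediaselector URL carries a different identifier (b00rw6g2) from the pid (b00rw6hf), so a client has to obtain it from the server's responses.

```python
from urllib.parse import urlsplit, urlunsplit


def episode_url(pid: str) -> str:
    """Persistent URI for a programme, in the form given in the text."""
    return f"http://www.bbc.co.uk/iplayer/episode/{pid}"


def console_url(episode: str) -> str:
    """Rewrite .../iplayer/episode/<pid> to .../iplayer/console/<pid>."""
    parts = urlsplit(episode)
    segments = parts.path.split("/")  # ["", "iplayer", "episode", "<pid>"]
    if len(segments) < 4 or segments[1:3] != ["iplayer", "episode"]:
        raise ValueError(f"not an iPlayer episode URL: {episode}")
    segments[2] = "console"
    return urlunsplit(parts._replace(path="/".join(segments)))


print(console_url(episode_url("b00rw6hf")))
```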

1.6 Observations

  1. Web applications have gotten more complicated than they need to be: notice the multiple redundant layers between Flash, JS, HTML, and the complex interplay that results during the HTTP conversation between client and server.
  2. Such complex interplay within multiple layers makes RESTful APIs difficult to achieve.
  3. It is possible that the underlying media stream URLs are being intentionally obfuscated. It's hard to imagine anyone wanting to voluntarily inflict the pain inherent in steps 1..6 without a valid reason.
  4. The obfuscation scheme makes it effectively impossible (on the surface) for interfaces other than the BBC iPlayer Web application to play the media.
  5. Note the phrase "on the surface" in the above. As a testament to the robustness of the architecture of the Web, steps 1..6 can be hidden in a computational blackbox that surfaces a reliable URI that can be emailed.
  6. As an implementation of the above, see this IPlayer Convertor found on the Web.
  7. In addition to providing a simple HTML form that takes a pid and performs the translation that happens during the client/server HTTP conversation, that site offers a persistent URL given a pid.
  8. What's more, the persistent URL offered up by this convertor is guessable given the pid - this in its turn then becomes a RESTful API for accessing BBC media streams given a pid.
  9. Thus, for the BBC Midnight News episode in question, the iPlayer convertor above serves up http://www.iplayerconverter.co.uk//pid/b00rw6hf/r/stream.aspx.
  10. Notice that replacing %s in
    http://www.iplayerconverter.co.uk/pid/%s/r/stream.aspx
    
    with a pid yields a persistent URL that can be handed off directly to a media player, provided that:
    1. The media player supports the codec in use.
    2. The media player supports the underlying streaming protocol, rtsp in this case.
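
The %s substitution described in the last observation amounts to a one-line string format. A minimal Python sketch, assuming only the URL template quoted above (the function name and the alphanumeric check on the pid are illustrative additions):

```python
# Template quoted in the text; %s is the placeholder for the pid.
CONVERTER_TEMPLATE = "http://www.iplayerconverter.co.uk/pid/%s/r/stream.aspx"


def stream_url(pid: str) -> str:
    """Return the converter's persistent URL for the given pid."""
    # Assumption: pids are plain alphanumeric tokens such as b00rw6hf.
    if not pid.isalnum():
        raise ValueError(f"unexpected pid: {pid!r}")
    return CONVERTER_TEMPLATE % pid


print(stream_url("b00rw6hf"))
```

The resulting URL can then be passed to any media player that meets the two conditions listed above.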

2 Conclusion

To conclude, let's return to the original questions:

  1. Given a URL, what can a user expect to be able to do with it, after having dereferenced the URI?
  2. How does the user discover what software bits he needs in order to consume the received HTTP Response?
  3. In Web 1.0 (Web of documents) the answer was simple: the HTTP Response header Content-Type specified the media type, which in turn specified what the recipient needed to understand.
  4. A recipient who only understands mime-type text/html in this example is likely to flee screaming in terror if he makes the mistake of doing Show Source.
  5. We all acknowledge that Show Source helped Web 1.0 succeed.
  6. Q: What is the equivalent of Show Source that will help us collectively take the Web to the next level?

Author: T.V Raman <raman@google.com>

Date: 2010-04-14 Wed


37 comments:

billcreswell said...

And of course the benefits of this exposure are things like dotSUB and Overstream Captions. (where's the descriptive audio equivalent)?
