Yesterday I attended the ja.net Arts and Humanities Streaming Workshop at the Royal College of Music. The idea of the day was to showcase a number of technologies that make use of ja.net’s high-bandwidth NREN backbone.
The event included an impressive demo from the originators of the Low Latency Audio Visual Streaming System (LOLA). The demo was presented as a live performance between a pianist performing in the media room at RCM (the event venue) and a clarinettist performing at Napier University in Edinburgh. The clarinettist’s sound was played to the audience through a pair of small near-field monitors, and a live video feed was projected onto a screen at the front of the venue.
The sound quality was excellent, and there was no significant musical difficulty related to network latency. When asked in the Q and A session afterwards, the clarinettist reported that he initially noticed some difference compared to a same-room performance, but soon got used to it. My guess is that the performers adapted to the slight latency (<50ms) and compensated for it in their playing.
However, from an audience perspective there was a noticeable difference in the spatial “presence” of the clarinet. Although the performer was moving around as part of the gestural expression of the piece, the audio was captured through a single microphone with a fixed pan position in the room. This, coupled with the lack of dispersion from the instrument body and of a natural room response, created ear fatigue after several minutes of listening.
I also strongly missed both the performer-audience interaction and the performer-performer interaction we normally get in a concert. This is usually subtle (eye contact, nuances of gesture and posture, etc.), but important in maintaining focus and interest. The overall effect was that of observing a studio recording session.
LOLA itself is a system for streaming audio from a general-purpose computer (e.g. a laptop) over a network with minimal processing delay (latency). It is very similar to CCRMA’s JackTrip project, and aside from some usability nuances, I struggled to understand the key advantages of LOLA over JackTrip. Further research into these systems, including quantitative tests and QoS benchmarks, is needed, with peer-reviewed publication as a next step.
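A big part of the fixed processing delay in systems like these comes from the audio buffer size: smaller buffers mean lower latency, at the cost of more frequent (and more loss-sensitive) network packets. Here is a minimal sketch of that trade-off, using generic figures of my own rather than LOLA’s or JackTrip’s actual settings:

```python
# Illustrative only: buffer-induced latency for an uncompressed audio stream.
# These figures are generic assumptions, not LOLA's or JackTrip's real settings.

def buffer_latency_ms(frames_per_buffer: int, sample_rate_hz: int) -> float:
    """Time taken to fill one audio buffer, in milliseconds."""
    return frames_per_buffer / sample_rate_hz * 1000

# Smaller buffers cut latency but produce more packets per second.
for frames in (64, 128, 256, 1024):
    print(f"{frames:>5} frames @ 48 kHz -> {buffer_latency_ms(frames, 48000):.2f} ms")
```

At 48kHz, a 64-frame buffer contributes only about 1.3ms of delay, whereas a typical general-purpose 1024-frame buffer contributes over 21ms before the signal even reaches the network.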
To achieve the kind of impressive performance witnessed in the LOLA demo, all of the variables in a fairly complex stack of hardware and software need to be finely tuned. The system needs to run on a high-speed, high-bandwidth link with minimal round-trip latency. There is also a threshold on the physical geographical distance between the locations involved: the current quoted figure is 1ms of network latency per 100km, so whilst the codec latency is fixed, the network latency resulting from cable distance is an important variable. Additionally, the microphone and speaker setup needs to be carefully tuned, as does the room itself, including the use of diffusers to avoid noticeable room echo. Finally, a music-optimised echo cancellation system such as Brian Shepard’s EchoDamp needs to be used and finely tweaked. Brian demoed his software during the event, and whilst it was impressive, I suspect there is much research to be done in this area too.
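To put the quoted distance figure into perspective, here is a back-of-the-envelope latency budget. The codec delay, the rough fibre distance, and the use of the <50ms figure as a one-way bound are my own illustrative assumptions, not numbers given at the workshop:

```python
# Back-of-the-envelope latency budget for a LOLA-style link.
# Only the propagation figure was quoted at the workshop; the codec delay
# and the one-way playability bound are illustrative assumptions.

PROPAGATION_MS_PER_100KM = 1.0  # figure quoted at the workshop
CODEC_LATENCY_MS = 5.0          # assumed fixed processing delay
PLAYABLE_BOUND_MS = 50.0        # rough one-way bound, per the <50ms figure

def one_way_latency_ms(distance_km: float) -> float:
    """Fixed codec delay plus distance-dependent propagation delay."""
    return CODEC_LATENCY_MS + (distance_km / 100.0) * PROPAGATION_MS_PER_100KM

def max_distance_km() -> float:
    """Distance at which the one-way latency budget is exhausted."""
    return (PLAYABLE_BOUND_MS - CODEC_LATENCY_MS) * 100.0 / PROPAGATION_MS_PER_100KM

# London to Edinburgh is very roughly 650 km (an assumed cable distance).
print(f"~650 km link: {one_way_latency_ms(650):.1f} ms one way")
print(f"Budget exhausted at ~{max_distance_km():.0f} km")
```

Under these assumptions a London–Edinburgh link sits comfortably inside the budget, and distance only becomes the limiting factor at continental scales; in practice, router hops and jitter buffers would eat into the budget well before the raw cable distance does.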
In the coffee break, discussion tended to focus on possible applications for LOLA, and for audio/video streaming in general. The general impression I got was that most of the delegates weren’t convinced by the notion of a videoconference-type setup being used in a live performance as a “substitute” for physical performer presence. The RCM have been using videoconferencing for delivering and receiving one-to-one instrumental tuition, and this seems to be one of the most promising areas for future development. I can also think of many niche “artistic” applications. However, the needs of these applications could be fulfilled by existing systems such as DVTS or Conference XP, and wouldn’t specifically require super-low round-trip latency. 100ms is perfectly acceptable for delivering a violin lesson, for example.
In conclusion, I’d say it is only a matter of time before low-latency audio and video streaming systems are commonplace. The question is: how can we use them to add value to what we do? The first talk in yesterday’s workshop focused on the non-technical aspect of this - what is the relationship between the “medium” and the “message”? I think as arts researchers, we should be asking not only how we can deliver existing experiences through a new medium, but also what new possibilities the medium offers that we hadn’t previously considered…