The infographic below shows the latency of the Audio Everywhere streaming audio system in context.
The two vertical bars in the graphic illustrate the typical end-to-end latency of the Audio Everywhere system on iOS and Android. For Android, we assume a high-quality device, such as those from Samsung, running Android 5 (Lollipop) or later.
Starting from the bottom-left, we see that the latency of the Audio Everywhere system is well within the requirements of simultaneous language translation. People typically quote this requirement as being between one and two seconds.
The Audio Everywhere system is also quite good for TV watching. While some A/V specialists can reliably detect lip sync issues as small as 20 ms, the vast majority of users are perfectly fine if the sound is within about 100 ms. Between 200 and 250 ms, the delay begins to get annoying. Note that in practical real-world situations, the delay in a modern TV actually helps. The TV typically buffers a few frames in order to reconstruct the digitally encoded video, and that video delay offsets part of the audio latency. If a digital distribution network is used, e.g., video over IP, then one actually has to delay the audio to get the lip sync aligned.
The audio in an auditorium travels at the speed of sound. Assuming public address speakers in the front, if one is back a few rows in the auditorium, then the sounds from the PA speakers and the Audio Everywhere system are reasonably in synchrony.
At the 2016 Hearing Loss Association of America (HLAA) conference, we asked participants to comment on our latency for lip reading. The conference had, of course, a huge concentration of lip readers. The feedback we received from over half a dozen participants was that the latency on the iPhone was acceptable, but the latency on Android didn’t quite do it. There are scholarly papers that say that less is better, but this was the test we did.
One of the more challenging situations for Wi-Fi audio is when the sound arrives via two different paths. This is common when sound comes both from public address speakers, where it travels through the air, and from the Audio Everywhere system, where it travels through the digital network. If the path delays differ by about 20 ms or more and the sounds are at the same amplitude, then most people can hear the multipath effect. The key to listening in this situation is to make sure that one path is louder than the other. In that case, the sound can seem a little “fuller” in some situations. In other situations, it is a little like being in a cathedral with strong reverberations. For people with profound hearing loss, the situation is manageable by making sure that the ambient path is off in the hearing aid. For others, an earphone may be required, or they can sit a little further back in the room. It is a manageable issue, but it can be annoying if not managed.
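To make the two-path reasoning concrete, here is a minimal Python sketch of the arithmetic. It assumes a speed of sound of 343 m/s (dry air at roughly 20 °C), and the function names and the 50 ms system latency in the example are hypothetical placeholders, not measured Audio Everywhere figures.

```python
# Rough two-path arithmetic: PA sound through the air vs. audio over the network.
# Assumption: speed of sound is about 343 m/s (dry air, roughly 20 degrees C).
SPEED_OF_SOUND_M_S = 343.0


def acoustic_delay_ms(distance_m: float) -> float:
    """Time for sound to travel distance_m through the air, in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_S * 1000.0


def path_difference_ms(distance_m: float, system_latency_ms: float) -> float:
    """Network latency minus acoustic delay.

    Positive: the streamed audio arrives after the PA sound.
    Negative: the streamed audio arrives first.
    """
    return system_latency_ms - acoustic_delay_ms(distance_m)


def matching_distance_m(system_latency_ms: float) -> float:
    """Distance from the PA speakers at which both paths arrive together."""
    return system_latency_ms / 1000.0 * SPEED_OF_SOUND_M_S


if __name__ == "__main__":
    hypothetical_latency_ms = 50.0  # placeholder value, not a measurement
    for distance in (5.0, 10.0, 20.0, 30.0):
        diff = path_difference_ms(distance, hypothetical_latency_ms)
        print(f"{distance:5.1f} m from the PA: path difference {diff:+6.1f} ms")
    print(f"Paths align at about {matching_distance_m(hypothetical_latency_ms):.1f} m")
```

Under these assumptions, a 20 ms path difference corresponds to just under 7 meters of travel through the air, which is why moving a few rows back can noticeably change how the two paths line up.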
We often get questions about using our system as an in-ear monitor for musicians. The short answer is no: we are nowhere near where we would need to be. The requirements are on the order of a millisecond.
Above the graph, we have listed four important times. The first is the duration of one beat for a song at 150 beats per minute. This is a fast song, such as Holding Out for a Hero (Bonnie Tyler), Crocodile Rock (Elton John), or Jumpin’ Jack Flash (The Rolling Stones). Our latency is a small fraction of a beat, even for a fast song. The second is the time for sound to travel 100 meters. You can see that this time is quite long compared with the latency of our system. This could be important in, for instance, an arena or stadium. Next, we show the time for the author to speak one syllable. You can see that the latency of the Audio Everywhere system is a small fraction of a syllable. Finally, we show how long it takes sound to travel 100 feet, or to travel 50 feet, bounce off a wall, and return. You have undoubtedly heard this effect in large rooms. It is similar to the delay when listening to Audio Everywhere with an Android smartphone.
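Three of these reference times can be checked with simple arithmetic. The sketch below is illustrative only: it treats the tempo as 150 beats per minute and assumes a speed of sound of 343 m/s (dry air at roughly 20 °C). The syllable time is a measured value from the infographic and is not computed here.

```python
# Back-of-the-envelope check of the reference times listed above the graph.
# Assumption: speed of sound is about 343 m/s (dry air, roughly 20 degrees C).
SPEED_OF_SOUND_M_S = 343.0
METERS_PER_FOOT = 0.3048


def beat_period_ms(beats_per_minute: float) -> float:
    """Duration of a single beat, in milliseconds."""
    return 60_000.0 / beats_per_minute


def sound_travel_ms(distance_m: float) -> float:
    """Time for sound to cover distance_m through the air, in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_S * 1000.0


beat_ms = beat_period_ms(150)                             # one beat of a fast song
hundred_m_ms = sound_travel_ms(100.0)                     # sound across 100 meters
hundred_ft_ms = sound_travel_ms(100.0 * METERS_PER_FOOT)  # 100 ft, or 50 ft and back

print(f"One beat at 150 bpm:     {beat_ms:.1f} ms")
print(f"Sound over 100 meters:   {hundred_m_ms:.1f} ms")
print(f"Sound over 100 feet:     {hundred_ft_ms:.1f} ms")
```

Under these assumptions the values come out to roughly 400 ms per beat, about 292 ms for 100 meters, and about 89 ms for 100 feet.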