Let us look first at a conventional JPEG-based digital transmission system with which most people are familiar. Let us presume that it captures images at 640 x 480 pixels (a simple number for calculation's sake). A conventional PAL video frame consists of two interlaced fields, so a full frame JPEG image would contain 2 x 640 x 480 pixels or 614 400 pixels.
Each pixel can be described by anything from 8 to 24 bits (or even 32 bits) of colour information. For example, 256 greyscales is the equivalent of 8 bits per pixel. 16 bits translates to 65 536 colours, 24 bits to 16,7 million colours (typically described as true colour) and 32 bits to 4,29 billion colours.
If this single JPEG frame (which consists of 614 400 pixels) is captured and described at 16 bits per pixel (65 536 colours), then the resultant discrete image would total 9 830 400 bits, or 1,23 MB, of information. A single field capture, rather than the combined full frame capture described above, would halve this to 614,4 kbyte.
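The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and its parameters are illustrative, and it follows the article's convention of counting each interlaced field as a full 640 x 480 image.

```python
def raw_image_bytes(width, height, bits_per_pixel, fields=2):
    """Uncompressed size in bytes of a capture made up of `fields` fields."""
    pixels = fields * width * height
    bits = pixels * bits_per_pixel
    return bits // 8

full_frame = raw_image_bytes(640, 480, 16, fields=2)
single_field = raw_image_bytes(640, 480, 16, fields=1)
print(full_frame)    # 1228800 bytes, about 1,23 MB
print(single_field)  # 614400 bytes, or 614,4 kbyte
```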
From here things get a little easier. JPEG compression algorithms can be manipulated to compress this image by 5, 10, 20 or even 50 times. The reality, though, is that JPEG's compression algorithms introduce visual artifacts (or 'blockiness') as compression ratios are increased. Whilst this may not be noticeable at 10x compression, it most certainly would be at 50x. Basically this means that the single field JPEG frame can probably be reduced to no less than about 20-30 kbyte (roughly 20x to 30x compression) whilst remaining a reasonable replication of the original image. Bear in mind that this is a full sized image (640 x 480 pixels).
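To put numbers to those ratios, here is a rough sketch that divides the 614,4 kbyte single field by each compression factor mentioned above. The helper name is made up for illustration, and real JPEG output sizes depend on image content, so treat the figures as ballpark only.

```python
SINGLE_FIELD_BYTES = 614_400  # 640 x 480 pixels at 16 bits per pixel

def compressed_kbyte(raw_bytes, ratio):
    """Approximate size in kbyte (1 kbyte = 1000 bytes) after compressing by `ratio`."""
    return raw_bytes / ratio / 1000

for ratio in (5, 10, 20, 50):
    print(f"{ratio}x -> {compressed_kbyte(SINGLE_FIELD_BYTES, ratio):.1f} kbyte")
```

At 20x the field comes out at about 30,7 kbyte and at 30x about 20,5 kbyte, which is where the 20-30 kbyte range comes from.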
It is easy to see that over a PSTN line, the maximum transfer rate of which is 56 kbit/s (about 7 kbyte/s), full sized images (20-30 kbyte) are never going to be transferred at frame rates of 10, 15 or 20 frames/s.
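A quick calculation makes the point. The sketch below assumes the nominal 56 kbit/s rate of a PSTN modem (about 7 kbyte/s) and ignores protocol overhead and line noise, both of which would lower the result further. The function name is illustrative.

```python
def max_frames_per_second(line_bits_per_second, frame_kbyte):
    """Upper bound on frame rate for frames of `frame_kbyte` (1 kbyte = 1000 bytes)."""
    bytes_per_second = line_bits_per_second / 8
    return bytes_per_second / (frame_kbyte * 1000)

PSTN_BITS_PER_SECOND = 56_000  # assumed nominal modem rate

for size in (20, 30):
    fps = max_frames_per_second(PSTN_BITS_PER_SECOND, size)
    print(f"{size} kbyte frame: {fps:.2f} frames/s, {1 / fps:.1f} s per frame")
```

Under these assumptions a full sized image arrives roughly once every three to four seconds, nowhere near 10 frames/s.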
But this is the crucial point. Many sophisticated systems do not transmit full resolution, full frame images! They store them, yes, but they transmit lower resolution images for rapid updates and a degree of visual verification.
Visual verification vs remote monitoring
This brings me back to the age-old debate in digital video surveillance. On the one hand there are applications that require visual verification, and on the other hand those that require remote monitoring. Visual verification is typically an event-driven application that needs high resolution image capture and storage capabilities, with pre-event image storage features, all at the remote site. Image transmission times are not crucial, because of the image storage capability at the remote site. Most importantly, the images stored are high resolution images which can be retrieved after the event and used effectively to identify culprits. Typically, these systems have been full frame JPEG-based systems.
Remote monitoring, on the other hand, is less concerned with image storage and identification and more concerned with rapid image update and the perception of obtaining real-time visuals from remote CCTV cameras. Typically these have been conditional refresh JPEG-based systems, in some cases MPEG-based systems, and a few even use the latest video conferencing standards.
In the latter applications (remote monitoring), high update rates are usually afforded by clever conditional refresh techniques, and individual frames can be as low as a few kbyte. In such cases, transfer rates of 10, or even 15 frames/s are quite achievable on PSTN lines. But it is clear that a 'frame' in the first instance (a high resolution (640 x 480 pixel), full frame (two fields), 16 bit, 30 kbyte JPEG image) is a far cry from its lesser counterpart in the second instance (a low resolution (perhaps 320 x 240 pixel), single field, 16 bit, conditional refresh, 1-5 kbyte image).
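How any particular product implements conditional refresh is not covered here, but the general idea is to send only those parts of the picture that have changed since the last update. A minimal sketch, assuming greyscale frames stored as lists of rows, a fixed block size and a simple per-block difference threshold (all illustrative choices, not taken from any real system):

```python
def changed_blocks(previous, current, block=8, threshold=10):
    """Return (row, col) origins of blocks whose pixels changed noticeably.

    `previous` and `current` are equal-sized greyscale frames given as
    lists of rows of integers. A block counts as changed when any pixel
    in it differs by more than `threshold` between the two frames.
    """
    height, width = len(current), len(current[0])
    changed = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            if any(
                abs(current[y][x] - previous[y][x]) > threshold
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ):
                changed.append((top, left))
    return changed

# Hypothetical 16 x 16 scene in which only one corner changes.
before = [[0] * 16 for _ in range(16)]
after = [row[:] for row in before]
after[12][13] = 200  # a single bright pixel in the bottom-right block

print(changed_blocks(before, after))  # [(8, 8)]
```

Here only one of the four blocks would need to be compressed and sent, which is why an update of a largely static scene can be so much smaller than a full image.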
So, your frame is almost never the same as my frame. Which means frame rates can really be quite misleading if used without qualification.
Of course, we are now also seeing fractal and Wavelet-based systems starting to enter the market, and these can justifiably claim to offer compression ratios an order of magnitude (10x) or more higher than those offered by conventional JPEG and MPEG-based systems. But whilst massively increased compression does aid image storage and transmission specifications, that is only one aspect of what can be very sophisticated systems. It does not intrinsically make them 'better' systems.
Where to from here
So where does this all leave us? Well, I would guess end-users could be forgiven for holding their heads in their hands and saying 'enough!' The truth of the matter though, is that we do not give end-users the credit they deserve. They know what works, and they know what doesn't.
Let me explain. As consumers, we all thought that the early video on CD technology was 'interesting,' 'novel,' 'cool' and all those things, but there was no way we were going to watch anything on a PC monitor when it was a pixelated, jerky video clip about the size of a postage stamp. (Funnily enough, did you ever stop to think that video on the web today is about as useful as video on CD was a few years ago?)
Today, of course, DVD and video compression technology have advanced to the stage that we can watch the latest blockbuster in full-screen glory, right there on our PCs.
And it is very, very impressive.
My point is that, as end-users, we know instinctively when what we see works, and also when it doesn't. Too many of today's digital video recording and transmission systems are still floored (or should that be flawed) by this yardstick, when what we see and what we are told do not measure up.
This 'instinctive' feel for the worth of a system does not preclude us from taking responsibility for asking the right questions, nor does it imply that the security industry should blithely expect end-users to make good judgement calls on the 'value of a particular system to their own specific application requirements'.
On the contrary. End-users know what their problems are, and should be able to expect professional security technology staff to advise them appropriately. All too often, the industry professes to have the client's interests at heart, but insists that its solution is the answer to every problem - a jack of all trades, master of none. Suppliers find it difficult to walk away from a potential customer knowing that they gave the right advice (for the client), but that it cost them the deal.
So, in the final analysis, expect to see more JPEG, MPEG, fractal, Wavelet and video conferencing based video recording/transmission systems on the market. Whilst specifications can be important (storage capacity and time for instance), the differing technologies are such that comparisons are of limited value.
From a technical point of view, I think the key issues are influenced more by questions such as robustness, interoperability, the availability of cost-effective storage media and communications modules, the ability to offer system integration and customisation, and the track record of the company offering the technology, from design and manufacture all the way through to after sales support.
From the client's point of view, the issue really boils down to defining the problem correctly and ensuring the solution is well considered and well thought-out, not a black-box, one-size-fits-all approach. For the security industry, where IT has only just started to play a role, the fun starts now!
© Technews Publishing (Pty) Ltd. | All Rights Reserved.