author | DRC <information@virtualgl.org> | 2012-03-10 11:27:18 -0600 |
---|---|---|
committer | Christian Beier <dontmind@freeshell.org> | 2012-03-11 15:34:50 +0100 |
commit | 97001a7e7bc80b45ac7c86413afa69f2a0300e7f (patch) | |
tree | db8870312846077190bd2b13267179007d99bbcf /common/turbojpeg.h | |
parent | 1078e8a8b050b5b4ebbcb011750f5dd2d8eacc37 (diff) | |
Add TurboVNC encoding support.

TurboVNC is a variant of TightVNC that uses the same client/server protocol (RFB version 3.8t),
and thus it is fully cross-compatible with TightVNC and TigerVNC (with one exception, which is noted below).
Both the TightVNC and TurboVNC encoders analyze each rectangle, pick out regions of solid color to send
separately, and send the remaining subrectangles using mono, indexed color, JPEG, or raw encoding, depending
on the number of colors in the subrectangle. However, TurboVNC uses a fundamentally different selection
algorithm to determine the appropriate subencoding to use for each subrectangle. Thus, while it sends a
protocol stream that can be decoded by any TightVNC-compatible viewer, the mix of subencoding types in this
protocol stream will be different from that generated by a TightVNC server.

The research that led to TurboVNC is described in the following report:
http://www.virtualgl.org/pmwiki/uploads/About/tighttoturbo.pdf

In summary: 20 RFB captures, representing "common" 2D and 3D application workloads (the 3D workloads were
run using VirtualGL), were studied using the TightVNC encoder in isolation. Some of the analysis features
in the TightVNC encoder, such as smoothness detection, were found to consume a lot of CPU time with little
or no compression benefit, so those features were disabled. JPEG encoding was accelerated using
libjpeg-turbo (which achieves a 2-4x speedup over plain libjpeg on modern x86 or ARM processors). Finally,
the "palette threshold" (the minimum number of colors that a subrectangle must have before it is compressed
using JPEG or raw) was adjusted to account for the fact that JPEG encoding is now considerably faster
(meaning that it can be used more often without a CPU penalty). TurboVNC has additional optimizations,
such as the ability to count colors and encode JPEG images directly from the framebuffer without first
translating the pixels into RGB. The TurboVNC encoder compares quite favorably with TightVNC in terms of
compression ratio and generally encodes a great deal faster (often by an order of magnitude or more).
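
As a rough illustration of selection by color count, the following C sketch picks a subencoding for one subrectangle. The enum, the function name choose_subencoding(), and the threshold of 24 used in the usage note are hypothetical and are not taken from libvncserver/turbo.c; the real encoder performs additional analysis that is not modeled here.

```c
#include <assert.h>

/* Subencoding categories named in the description above. */
typedef enum { SUB_SOLID, SUB_MONO, SUB_INDEXED, SUB_JPEG, SUB_RAW } subenc_t;

/* Hypothetical selector (not the actual TurboVNC algorithm).
   ncolors           = number of distinct colors counted in the subrectangle
   palette_threshold = minimum color count at which JPEG or raw is used
   jpeg_enabled      = nonzero if the viewer has requested JPEG */
subenc_t choose_subencoding(int ncolors, int palette_threshold, int jpeg_enabled)
{
  assert(ncolors >= 1 && palette_threshold >= 3);
  if (ncolors == 1) return SUB_SOLID;                   /* single color */
  if (ncolors == 2) return SUB_MONO;                    /* two-color bitmap */
  if (ncolors < palette_threshold) return SUB_INDEXED;  /* small palette */
  return jpeg_enabled ? SUB_JPEG : SUB_RAW;             /* too many colors */
}
```

With an assumed threshold of 24, a subrectangle with 10 colors would be sent as indexed color, while one with 300 colors would be sent as JPEG (or raw if JPEG is disabled). Adjusting the threshold shifts work between the palette path and the JPEG path.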

The version of the TurboVNC encoder included in this patch is roughly equivalent to the one found in version
0.6 of the Unix TurboVNC Server, with a few minor patches integrated from TurboVNC 1.1. TurboVNC 1.0
added multi-threading capabilities, which can be added later if desired (at the expense of making
libvncserver depend on libpthread).

Because TurboVNC uses a fundamentally different mix of subencodings than TightVNC, because it uses
the identical protocol (and thus a viewer really has no idea whether it's talking to a TightVNC or
TurboVNC server), and because it doesn't support rfbTightPng (and in fact conflicts with it; see below),
the TurboVNC and TightVNC encoders cannot be enabled simultaneously.

Compatibility:
In *most* cases, a TurboVNC-enabled viewer is fully compatible with a TightVNC server, and vice versa.
TurboVNC supports pseudo-encodings for specifying a fine-grained (1-100) quality scale and specifying
chrominance subsampling. If a TurboVNC viewer sends those to a TightVNC server, then the TightVNC server
ignores them, so the TurboVNC viewer also sends the quality on a 0-9 scale that the TightVNC server can
understand. Similarly, the TurboVNC server checks first for fine-grained quality and subsampling
pseudo-encodings from the viewer, and failing to receive those, it then checks for the TightVNC 0-9
quality pseudo-encoding.
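
The order of precedence described above can be sketched in C as follows. The struct, the QUALITY_UNSET sentinel, and both function names are hypothetical, and the linear mapping in tight_to_fine_quality() is only a placeholder; it is not the table used by libvncserver/rfbserver.c.

```c
#include <assert.h>

#define QUALITY_UNSET (-1)  /* hypothetical "not received" marker */

/* What the viewer has sent so far (hypothetical structure). */
typedef struct {
  int fine_quality;   /* 1-100 fine-grained level, or QUALITY_UNSET */
  int tight_quality;  /* 0-9 TightVNC level, or QUALITY_UNSET */
} viewer_prefs;

/* Placeholder mapping from the 0-9 scale to the 1-100 scale.
   The real server uses its own mapping; this one is illustrative only. */
int tight_to_fine_quality(int level)
{
  assert(level >= 0 && level <= 9);
  return 10 + level * 10;
}

/* Prefer the fine-grained level; fall back to the 0-9 level; otherwise
   use the server default. */
int effective_quality(const viewer_prefs *p, int default_quality)
{
  if (p->fine_quality != QUALITY_UNSET) return p->fine_quality;
  if (p->tight_quality != QUALITY_UNSET)
    return tight_to_fine_quality(p->tight_quality);
  return default_quality;
}
```

A TurboVNC viewer that sends both values is handled by the first branch, and a TightVNC or TigerVNC viewer that sends only the 0-9 level is handled by the second.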
There is one case in which the two systems are not compatible, and that is when a TightVNC or TigerVNC
viewer requests compression level 0 without JPEG from a TurboVNC server. For performance reasons,
this causes the TurboVNC server to send images directly to the viewer, bypassing Zlib. When the
TurboVNC server does this, it also sets bits 7-4 in the compression control byte to rfbTightNoZlib (0x0A),
which is unfortunately the same value as rfbTightPng. Older TightVNC viewers that don't handle PNG
will assume that the stream is uncompressed but still encapsulated in a Zlib structure, whereas newer
PNG-supporting TightVNC viewers will assume that the stream is PNG. In either case, the viewer will
probably crash. Since most VNC viewers don't expose compression level 0 in the GUI, this is a
relatively rare situation.
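
The collision can be seen by packing and unpacking the compression control byte. In this minimal sketch the two constants carry the value stated above (0x0A); the helper names make_control() and tight_subencoding() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Both constants have the value given in the text above. */
#define rfbTightNoZlib 0x0A
#define rfbTightPng    0x0A

/* Build a control byte with the subencoding in bits 7-4 (hypothetical helper). */
uint8_t make_control(unsigned subenc, unsigned low_bits)
{
  assert(subenc <= 0x0F && low_bits <= 0x0F);
  return (uint8_t)((subenc << 4) | low_bits);
}

/* Extract bits 7-4 of the control byte (hypothetical helper). */
unsigned tight_subencoding(uint8_t control)
{
  return (control >> 4) & 0x0F;
}
```

A byte built with make_control(rfbTightNoZlib, 0) is 0xA0, and a PNG-aware viewer that extracts the upper four bits gets a value equal to rfbTightPng, so it has no way to tell the two cases apart.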
Description of changes:
configure.ac
-- Added support for libjpeg-turbo. If passed an argument of --with-turbovnc, configure will now run
(or, if cross-compiling, just link) a test program that determines whether the libjpeg library being
used is libjpeg-turbo. libjpeg-turbo must be used when building the TurboVNC encoder, because the
TurboVNC encoder relies on the libjpeg-turbo colorspace extensions in order to compress images directly
out of the framebuffer (which may be, for instance, BGRA rather than RGB). libjpeg-turbo can optionally
be used with the TightVNC encoder as well, but the speedup will only be marginal. The report linked
above explains why in more detail, but it is essentially a consequence of Amdahl's Law: the TightVNC
encoder was designed on the assumption that JPEG has a very high CPU cost, so it uses JPEG only sparingly.
-- Added a new configure variable, JPEG_LDFLAGS. This is necessitated by the fact that the libjpeg-turbo
packages often install libjpeg.a and libjpeg.so in /opt/libjpeg-turbo/lib32 or /opt/libjpeg-turbo/lib64,
and many people prefer to statically link with it. Thus, more flexibility is needed than is provided
by --with-jpeg. If JPEG_LDFLAGS is specified, then it overrides the changes to LDFLAGS enacted by
--with-jpeg (but --with-jpeg is still used to set the include path). The addition of JPEG_LDFLAGS
necessitated replacing AC_CHECK_LIB with AC_LINK_IFELSE, because AC_CHECK_LIB automatically sets
LIBS to -ljpeg, which is not what we want when, for instance, linking statically with libjpeg-turbo.
-- configure does not check for PNG support if TurboVNC encoding is enabled. This prevents the
rfbSendRectEncodingTightPng() function from being compiled in, since the TurboVNC encoder doesn't
(and can't) support it.
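
The Amdahl's Law argument in the libjpeg-turbo item above can be made concrete with a short calculation. The function below is a minimal sketch; the name amdahl_speedup() is hypothetical, and the JPEG time fractions in the usage note are made-up illustrative inputs, not measurements from the report.

```c
#include <assert.h>

/* Overall speedup when only the JPEG portion of the encode time gets faster.
   jpeg_fraction = share of total encode time spent in JPEG (0.0 to 1.0)
   jpeg_speedup  = factor by which the JPEG portion is accelerated */
double amdahl_speedup(double jpeg_fraction, double jpeg_speedup)
{
  assert(jpeg_fraction >= 0.0 && jpeg_fraction <= 1.0 && jpeg_speedup > 0.0);
  return 1.0 / ((1.0 - jpeg_fraction) + jpeg_fraction / jpeg_speedup);
}
```

If an encoder spent 10% of its time in JPEG (an assumed figure), a 4x faster JPEG codec would give amdahl_speedup(0.1, 4.0), roughly 1.08x overall. If it spent 80% of its time in JPEG, the same codec would give 2.5x. An encoder that uses JPEG sparingly therefore gains little from a faster codec.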
common/turbojpeg.c, common/turbojpeg.h
-- TurboJPEG is a simple API used to compress and decompress JPEG images in memory. It was originally
implemented because it was desirable to use different types of underlying technologies to compress
JPEG on different platforms (mediaLib on SPARC, QuickTime on PPC Macs, Intel Performance Primitives, etc.).
These days, however, libjpeg-turbo is the only underlying technology used by TurboVNC, so TurboJPEG's
purpose is largely just code simplicity and flexibility. Thus, since there is no real need for
libvncserver to use any technology other than libjpeg-turbo for compressing JPEG, the TurboJPEG wrapper
for libjpeg-turbo has been included in-tree so that libvncserver can be directly linked with libjpeg-turbo.
This is convenient because many modern Linux distros (Fedora, Ubuntu, etc.) now ship libjpeg-turbo as
their default libjpeg library.
libvncserver/rfbserver.c
-- Added logic to check for the TurboVNC fine-grained quality level and subsampling encodings and to
map Tight (0-9) quality levels to appropriate fine-grained quality level and subsampling values if
communicating with a TightVNC/TigerVNC viewer.
libvncserver/turbo.c
-- TurboVNC encoder (compiled instead of libvncserver/tight.c)
rfb/rfb.h
-- Added support for the TurboVNC subsampling level
rfb/rfbproto.h
-- Added constants for the TurboVNC fine quality level and subsampling encodings as well as the rfbTightNoZlib
constant and notes on its usage.
Diffstat (limited to 'common/turbojpeg.h')
-rw-r--r-- | common/turbojpeg.h | 255 |
1 files changed, 255 insertions, 0 deletions
diff --git a/common/turbojpeg.h b/common/turbojpeg.h
new file mode 100644
index 0000000..6e3e259
--- /dev/null
+++ b/common/turbojpeg.h
@@ -0,0 +1,255 @@
+/* Copyright (C)2004 Landmark Graphics Corporation
+ * Copyright (C)2005, 2006 Sun Microsystems, Inc.
+ * Copyright (C)2009-2011 D. R. Commander
+ *
+ * This library is free software and may be redistributed and/or modified under
+ * the terms of the wxWindows Library License, Version 3.1 or (at your option)
+ * any later version. The full license is in the LICENSE.txt file included
+ * with this distribution.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * wxWindows Library License for more details.
+ */
+
+#if (defined(_MSC_VER) || defined(__CYGWIN__) || defined(__MINGW32__)) \
+ && defined(_WIN32) && defined(DLLDEFINE)
+#define DLLEXPORT __declspec(dllexport)
+#else
+#define DLLEXPORT
+#endif
+
+#define DLLCALL
+
+
+/* Subsampling */
+#define NUMSUBOPT 4
+
+enum {TJ_444=0, TJ_422, TJ_420, TJ_GRAYSCALE};
+#define TJ_411 TJ_420 /* for backward compatibility with VirtualGL <= 2.1.x,
+                         TurboVNC <= 0.6, and TurboJPEG/IPP */
+
+
+/* Flags */
+#define TJ_BGR 1
+  /* The components of each pixel in the source/destination bitmap are stored
+     in B,G,R order, not R,G,B */
+#define TJ_BOTTOMUP 2
+  /* The source/destination bitmap is stored in bottom-up (Windows, OpenGL)
+     order, not top-down (X11) order */
+#define TJ_FORCEMMX 8
+  /* Turn off CPU auto-detection and force TurboJPEG to use MMX code
+     (IPP and 32-bit libjpeg-turbo versions only) */
+#define TJ_FORCESSE 16
+  /* Turn off CPU auto-detection and force TurboJPEG to use SSE code
+     (32-bit IPP and 32-bit libjpeg-turbo versions only) */
+#define TJ_FORCESSE2 32
+  /* Turn off CPU auto-detection and force TurboJPEG to use SSE2 code
+     (32-bit IPP and 32-bit libjpeg-turbo versions only) */
+#define TJ_ALPHAFIRST 64
+  /* If the source/destination bitmap is 32 bpp, assume that each pixel is
+     ARGB/XRGB (or ABGR/XBGR if TJ_BGR is also specified) */
+#define TJ_FORCESSE3 128
+  /* Turn off CPU auto-detection and force TurboJPEG to use SSE3 code
+     (64-bit IPP version only) */
+#define TJ_FASTUPSAMPLE 256
+  /* Use fast, inaccurate 4:2:2 and 4:2:0 YUV upsampling routines
+     (libjpeg and libjpeg-turbo versions only) */
+
+
+typedef void* tjhandle;
+
+#define TJPAD(p) (((p)+3)&(~3))
+#ifndef max
+ #define max(a,b) ((a)>(b)?(a):(b))
+#endif
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* API follows */
+
+
+/*
+  tjhandle tjInitCompress(void)
+
+  Creates a new JPEG compressor instance, allocates memory for the structures,
+  and returns a handle to the instance. Most applications will only
+  need to call this once at the beginning of the program or once for each
+  concurrent thread. Don't try to create a new instance every time you
+  compress an image, because this may cause performance to suffer in some
+  TurboJPEG implementations.
+
+  RETURNS: NULL on error
+*/
+DLLEXPORT tjhandle DLLCALL tjInitCompress(void);
+
+
+/*
+  int tjCompress(tjhandle j,
+     unsigned char *srcbuf, int width, int pitch, int height, int pixelsize,
+     unsigned char *dstbuf, unsigned long *size,
+     int jpegsubsamp, int jpegqual, int flags)
+
+  [INPUT] j = instance handle previously returned from a call to
+     tjInitCompress()
+  [INPUT] srcbuf = pointer to user-allocated image buffer containing RGB or
+     grayscale pixels to be compressed
+  [INPUT] width = width (in pixels) of the source image
+  [INPUT] pitch = bytes per line of the source image (width*pixelsize if the
+     bitmap is unpadded, else TJPAD(width*pixelsize) if each line of the bitmap
+     is padded to the nearest 32-bit boundary, such as is the case for Windows
+     bitmaps. You can also be clever and use this parameter to skip lines,
+     etc. Setting this parameter to 0 is the equivalent of setting it to
+     width*pixelsize.
+  [INPUT] height = height (in pixels) of the source image
+  [INPUT] pixelsize = size (in bytes) of each pixel in the source image
+     RGBX/BGRX/XRGB/XBGR: 4, RGB/BGR: 3, Grayscale: 1
+  [INPUT] dstbuf = pointer to user-allocated image buffer that will receive
+     the JPEG image. Use the TJBUFSIZE(width, height) function to determine
+     the appropriate size for this buffer based on the image width and height.
+  [OUTPUT] size = pointer to unsigned long that receives the size (in bytes)
+     of the compressed image
+  [INPUT] jpegsubsamp = Specifies either 4:2:0, 4:2:2, 4:4:4, or grayscale
+     subsampling. When the image is converted from the RGB to YCbCr colorspace
+     as part of the JPEG compression process, every other Cb and Cr
+     (chrominance) pixel can be discarded to produce a smaller image with
+     little perceptible loss of image clarity (the human eye is more sensitive
+     to small changes in brightness than small changes in color.)
+
+     TJ_420: 4:2:0 subsampling. Discards every other Cb, Cr pixel in both
+        horizontal and vertical directions
+     TJ_422: 4:2:2 subsampling. Discards every other Cb, Cr pixel only in
+        the horizontal direction
+     TJ_444: no subsampling
+     TJ_GRAYSCALE: Generate grayscale JPEG image
+
+  [INPUT] jpegqual = JPEG quality (an integer between 0 and 100 inclusive)
+  [INPUT] flags = the bitwise OR of one or more of the flags described in the
+     "Flags" section above
+
+  RETURNS: 0 on success, -1 on error
+*/
+DLLEXPORT int DLLCALL tjCompress(tjhandle j,
+	unsigned char *srcbuf, int width, int pitch, int height, int pixelsize,
+	unsigned char *dstbuf, unsigned long *size,
+	int jpegsubsamp, int jpegqual, int flags);
+
+
+/*
+  unsigned long TJBUFSIZE(int width, int height)
+
+  Convenience function that returns the maximum size of the buffer required to
+  hold a JPEG image with the given width and height
+
+  RETURNS: -1 if arguments are out of bounds
+*/
+DLLEXPORT unsigned long DLLCALL TJBUFSIZE(int width, int height);
+
+
+/*
+  tjhandle tjInitDecompress(void)
+
+  Creates a new JPEG decompressor instance, allocates memory for the
+  structures, and returns a handle to the instance. Most applications will
+  only need to call this once at the beginning of the program or once for each
+  concurrent thread. Don't try to create a new instance every time you
+  decompress an image, because this may cause performance to suffer in some
+  TurboJPEG implementations.
+
+  RETURNS: NULL on error
+*/
+DLLEXPORT tjhandle DLLCALL tjInitDecompress(void);
+
+
+/*
+  int tjDecompressHeader2(tjhandle j,
+     unsigned char *srcbuf, unsigned long size,
+     int *width, int *height, int *jpegsubsamp)
+
+  [INPUT] j = instance handle previously returned from a call to
+     tjInitDecompress()
+  [INPUT] srcbuf = pointer to a user-allocated buffer containing a JPEG image
+  [INPUT] size = size of the JPEG image buffer (in bytes)
+  [OUTPUT] width = width (in pixels) of the JPEG image
+  [OUTPUT] height = height (in pixels) of the JPEG image
+  [OUTPUT] jpegsubsamp = type of chrominance subsampling used when compressing
+     the JPEG image
+
+  RETURNS: 0 on success, -1 on error
+*/
+DLLEXPORT int DLLCALL tjDecompressHeader2(tjhandle j,
+	unsigned char *srcbuf, unsigned long size,
+	int *width, int *height, int *jpegsubsamp);
+
+/*
+  Legacy version of the above function
+*/
+DLLEXPORT int DLLCALL tjDecompressHeader(tjhandle j,
+	unsigned char *srcbuf, unsigned long size,
+	int *width, int *height);
+
+
+/*
+  int tjDecompress(tjhandle j,
+     unsigned char *srcbuf, unsigned long size,
+     unsigned char *dstbuf, int width, int pitch, int height, int pixelsize,
+     int flags)
+
+  [INPUT] j = instance handle previously returned from a call to
+     tjInitDecompress()
+  [INPUT] srcbuf = pointer to a user-allocated buffer containing the JPEG image
+     to decompress
+  [INPUT] size = size of the JPEG image buffer (in bytes)
+  [INPUT] dstbuf = pointer to user-allocated image buffer that will receive
+     the bitmap image. This buffer should normally be pitch*height
+     bytes in size, although this pointer may also be used to decompress into
+     a specific region of a larger buffer.
+  [INPUT] width = width (in pixels) of the destination image
+  [INPUT] pitch = bytes per line of the destination image (width*pixelsize if
+     the bitmap is unpadded, else TJPAD(width*pixelsize) if each line of the
+     bitmap is padded to the nearest 32-bit boundary, such as is the case for
+     Windows bitmaps. You can also be clever and use this parameter to skip
+     lines, etc. Setting this parameter to 0 is the equivalent of setting it
+     to width*pixelsize.
+  [INPUT] height = height (in pixels) of the destination image
+  [INPUT] pixelsize = size (in bytes) of each pixel in the destination image
+     RGBX/BGRX/XRGB/XBGR: 4, RGB/BGR: 3, Grayscale: 1
+  [INPUT] flags = the bitwise OR of one or more of the flags described in the
+     "Flags" section above.
+
+  RETURNS: 0 on success, -1 on error
+*/
+DLLEXPORT int DLLCALL tjDecompress(tjhandle j,
+	unsigned char *srcbuf, unsigned long size,
+	unsigned char *dstbuf, int width, int pitch, int height, int pixelsize,
+	int flags);
+
+
+/*
+  int tjDestroy(tjhandle h)
+
+  Frees structures associated with a compression or decompression instance
+
+  [INPUT] h = instance handle (returned from a previous call to
+     tjInitCompress() or tjInitDecompress()
+
+  RETURNS: 0 on success, -1 on error
+*/
+DLLEXPORT int DLLCALL tjDestroy(tjhandle h);
+
+
+/*
+  char *tjGetErrorStr(void)
+
+  Returns a descriptive error message explaining why the last command failed
+*/
+DLLEXPORT char* DLLCALL tjGetErrorStr(void);
+
+
+#ifdef __cplusplus
+}
+#endif
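
The pitch parameter documented in the header comments is easiest to understand with concrete numbers. The sketch below copies TJPAD, TJ_BGR, and TJ_BOTTOMUP from the header so that it compiles without the library; row_pitch() is a hypothetical helper that is not part of the TurboJPEG API.

```c
#include <assert.h>

/* Copied from common/turbojpeg.h so the example is self-contained. */
#define TJPAD(p) (((p)+3)&(~3))
#define TJ_BGR 1
#define TJ_BOTTOMUP 2

/* Hypothetical helper: bytes per line for a bitmap that is either unpadded
   or padded to the next 32-bit boundary. */
int row_pitch(int width, int pixelsize, int padded)
{
  assert(width > 0 && (pixelsize == 1 || pixelsize == 3 || pixelsize == 4));
  return padded ? TJPAD(width * pixelsize) : width * pixelsize;
}
```

For a 5-pixel-wide RGB image, row_pitch(5, 3, 0) is 15 bytes and row_pitch(5, 3, 1) is 16 bytes, which is the value a Windows-style bitmap would need. Flags combine with bitwise OR, so a bottom-up BGR bitmap would be described by TJ_BGR | TJ_BOTTOMUP, which evaluates to 3.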