Wednesday, September 29, 2010

PDF compression

PDF stands for Portable Document Format, a format well suited to distributing data. As the name suggests, it is portable. A PDF file can combine text, images, videos, animations and many other components, and together these can make the file very large. Because PDF files are meant for distribution, they should be as small as possible, and that is why PDF compression is used.

PDF files mainly consist of physical data and metadata, both of which are created along with the file. Every component embedded in the file is recorded in the metadata as well as in the physical data, so the file takes up a lot of space. Also, when different fonts are used, each font has to be embedded in the PDF file, which makes the file bulkier.

PDF compression basically removes the parts of the PDF file that are unused or unnecessary, which frees most of the space occupied by junk. The PDF compressor also finds data that occurs repeatedly and stores it once, in a single place that can be referenced as many times as necessary, effectively reducing the space occupied.
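
The idea of storing repeated data once can be sketched in a few lines of Python. This is only an illustration and not how any particular PDF compressor works: the `deduplicate` function and the sample byte strings are hypothetical, and a real tool would operate on the objects inside the PDF file rather than on a plain list.

```python
import hashlib

def deduplicate(streams):
    """Store each distinct byte string once and keep a reference per use."""
    store = {}  # digest -> the single stored copy of the data
    refs = []   # one digest per original stream, in the original order
    for data in streams:
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # only the first copy is kept
        refs.append(digest)
    return store, refs

# Hypothetical content: the same logo appears on two pages.
streams = [b"logo-image-bytes" * 100, b"page text", b"logo-image-bytes" * 100]
store, refs = deduplicate(streams)

before = sum(len(s) for s in streams)
after = sum(len(s) for s in store.values())
rebuilt = [store[r] for r in refs]  # every use is still recoverable
print(before, after)
```

The references let the original sequence be rebuilt exactly, so nothing is lost by keeping one copy.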

PDF compression is done by software that can either be downloaded or used online. Using a downloaded PDF compressor is called offline PDF compression, and using one on the web is called online PDF compression.

PDF compressors also reduce the size of the file by reducing the number of colours used in its images. They can reduce it further by lowering the resolution of an image or by converting the image to grayscale. PDF compression software can also shrink an embedded video by decreasing the number of frames per second and by lowering the resolution of the video.
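
As a rough sketch of why grayscale conversion and lower resolution save space, the Python below works on a tiny made-up image held as rows of (r, g, b) tuples. The function names, the sample image and the integer luminance weights are assumptions chosen for illustration; real compressors use proper image libraries and better resampling.

```python
def to_grayscale(pixels):
    """Turn rows of (r, g, b) tuples into rows of single 0-255 gray values."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in pixels]

def downsample(pixels, factor):
    """Keep every factor-th pixel in each direction (nearest neighbour)."""
    return [row[::factor] for row in pixels[::factor]]

def raw_size(pixels, bytes_per_pixel):
    """Uncompressed size in bytes for the given pixel layout."""
    return sum(len(row) for row in pixels) * bytes_per_pixel

# Hypothetical 8x8 colour image, 3 bytes per pixel.
image = [[(200, 120, 40)] * 8 for _ in range(8)]
gray = to_grayscale(image)   # 1 byte per pixel
small = downsample(gray, 2)  # half the width and half the height

sizes = (raw_size(image, 3), raw_size(gray, 1), raw_size(small, 1))
print(sizes)  # prints (192, 64, 16)
```

Both steps throw information away, so unlike lossless compression they cannot be undone.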
For the data to be exactly the same before and after PDF compression, a lossless method such as Flate (the Deflate method also used by ZIP) or LZW can be used.
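
A minimal way to see the lossless property is a round trip through Python's standard `zlib` module, which implements the Deflate method. The sample text is made up, and this compresses a plain byte string rather than a real PDF stream.

```python
import zlib

# Hypothetical, highly repetitive content standing in for a PDF stream.
original = b"PDF compression " * 200

compressed = zlib.compress(original, 9)  # 9 = strongest compression level
restored = zlib.decompress(compressed)

# Lossless: the restored bytes are identical to the original.
print(restored == original, len(original), len(compressed))
```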