Compression

Compression is less important than it used to be now that storage is cheaper and bandwidth is larger, but it is still a vital way to transfer files faster and free up storage. For the large data warehouses maintained by big companies like Google and Amazon, reducing the data stored and transferred by any amount reduces the computing power and storage needed, which results in savings (1).

In short, compression is the reduction of the space a file uses to store the same information or, in some cases, approximately the same information. All forms of compression in computer science use some form of encoding and decoding. To compress a file or transmission, you encode it with an algorithm or process that converts the data into a specific format that can still be decoded back into the original data, or an approximation of it. Compressing a data file, as when you zip a file, is usually called data compression; when data is compressed to be transferred, it is called source coding. In both cases, the device or process that compresses the data is referred to as the encoder, and the one that uncompresses it is called the decoder (2).

The major trade-off of compression is time: if you are compressing a video to transmit to a user, the time it takes to decode could result in a poor user experience. There are many different forms and algorithms for compression that deal with this space-time trade-off, most of them designed for certain uses. We will talk about particular forms of compression in later blog posts, but for now we will talk about the two major categories of compression, known as lossy and lossless (2). We will start with the more straightforward of the two, lossless compression.
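To make the encoder/decoder idea and the space-time trade-off a bit more concrete, here is a minimal sketch in Python using the standard library's zlib module. The sample data and the function names encode and decode are made up for illustration. The compression level is just one knob for trading effort against size, and the exact sizes you see will depend on the input.

```python
import zlib

# Hypothetical sample data: repetitive text, which compresses well.
original = b"compression saves space and bandwidth. " * 200


def encode(data: bytes, level: int = 6) -> bytes:
    """Encoder: turn raw bytes into a (hopefully) smaller representation."""
    return zlib.compress(data, level)


def decode(blob: bytes) -> bytes:
    """Decoder: recover the original bytes from the compressed form."""
    return zlib.decompress(blob)


fast = encode(original, level=1)   # least effort; output is usually larger
small = encode(original, level=9)  # most effort; output is usually smaller

print("original:", len(original), "bytes")
print("level 1: ", len(fast), "bytes")
print("level 9: ", len(small), "bytes")

# Decoding either version gives back exactly the bytes we started with.
assert decode(fast) == original
assert decode(small) == original
```

Both compressed versions decode to the same bytes; the level only changes how hard the encoder works to shrink them.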

Lossless

Lossless compression, simply put, is compression that does not lose any data between encoding and decoding. If you encode a file and then decode it, you get the exact same file, with the same size and values. This is usually done by taking advantage of statistical redundancies in the data. For example, if you were to compress this blog post, you would find far more ‘e’s and ‘i’s than ‘q’s and ‘z’s, and in a digital photo of “The Starry Night” you would have a lot more blue and green pixels than red and purple ones. Normally each pixel or letter is represented by the same number of bits, but since some values occur much more often than others, we can encode the data so that the more common values are represented by fewer bits (2). This approach depends on the data being compressed, but it lets us keep all of the original contents of the data.
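One well-known way to give common symbols shorter codes is Huffman coding. The post does not name a specific algorithm, so treat this as one illustrative choice rather than the method the sources describe. The sketch below is a minimal Python version; the function names and the sample string are assumptions made up for the example.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict[str, str]:
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text: str, codes: dict[str, str]) -> str:
    """Replace each character with its bit string."""
    return "".join(codes[ch] for ch in text)


def decode(bits: str, codes: dict[str, str]) -> str:
    """Read bits until they match a code, emit that symbol, and repeat."""
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)


# Hypothetical sample: many 'e's and 'i's, few 'q's and 'z's.
text = "eeeeeeeeiiiiiqz"
codes = huffman_codes(text)
bits = encode(text, codes)

print(codes)
print(len(bits), "bits, versus", 8 * len(text), "bits at 8 bits per character")

# Lossless: decoding gives back exactly the original text.
assert decode(bits, codes) == text
```

In this sample, 'e' ends up with the shortest code and 'q' and 'z' with the longest, which is where the space saving comes from. A real file format would also need to store the code table alongside the bits, which this sketch leaves out.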

Lossy

Whether you know it or not, you are probably very familiar with lossy compression. If you’ve ever seen an extremely blurry image on the internet, that is not necessarily because it was taken on the first camera phone ever; it may be because it has been shared so many times between websites and people, going through lossy compression over and over again, that it has lost a lot of its detail. As you may have gathered, lossy compression is the opposite of lossless: some of the original information is lost when the data is compressed (2). Forms of lossy compression are much more complicated and usually rely on psychology and how the data is perceived, using things like what the focal point of an image will be, which variations in color the human eye can perceive well, or which wavelengths of sound the human ear can hear, since there is no point in spending extra to stream music so your dog can enjoy it too. In short, lossy compression decides what data is important and what is not, then encodes accordingly.
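As a toy illustration of throwing away "unimportant" detail, the sketch below quantizes 8-bit sample values (0 to 255) into 16 buckets, so each value needs 4 bits instead of 8. This is only an assumption-laden simplification: real image and audio formats use far more elaborate perceptual models, and the function names, step size, and sample values here are made up for the example.

```python
def lossy_encode(samples: list[int], step: int = 16) -> list[int]:
    """Keep only which bucket each 0-255 sample falls into."""
    return [s // step for s in samples]


def lossy_decode(buckets: list[int], step: int = 16) -> list[int]:
    """Rebuild an approximation using the midpoint of each bucket."""
    return [b * step + step // 2 for b in buckets]


# Hypothetical 8-bit samples, e.g. brightness values of a few pixels.
original = [12, 13, 200, 201, 202, 90, 91, 255]

encoded = lossy_encode(original)   # every value is now in the range 0-15
restored = lossy_decode(encoded)   # close to the original, but not equal

print("original:", original)
print("encoded: ", encoded)
print("restored:", restored)

# The fine detail is gone for good: nearby values collapse to the same number.
worst_error = max(abs(a - b) for a, b in zip(original, restored))
print("worst error:", worst_error)
```

Unlike the lossless case, decoding cannot recover the original values. The encoder has decided that differences smaller than one bucket are not worth storing.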

I hope this blog post was interesting to you and gave you a high-level explanation of how compression works in computer science.

Sources:

https://csfieldguide.org.nz/en/chapters/coding-compression/

https://en.wikipedia.org/wiki/Data_compression
