Representing data

All data inside a computer is transmitted as a series of electrical signals that are either on or off. Therefore, for a computer to process any kind of data, including text, images and sound, that data must first be converted into binary form: a series of 1s and 0s. Data that is not in binary form cannot be understood or processed by the computer.

Before a computer can understand any information, it must first be converted into binary. Audio, video, images or written text must be converted from their original formats into binary code.

Representing text

When a key on a keyboard is pressed, the character it stands for needs to be converted into a binary number so that the computer can process it and the typed character can appear on the screen.

The letter "A" on a keyboard converts to 01000001 in binary

A code where each number represents a character can be used to convert text into binary. One code we can use for this is called ASCII. The ASCII code takes each character on the keyboard and assigns it a binary number. For example:

  • the letter ‘a’ has the binary number 0110 0001 (this is the denary number 97)
  • the letter ‘b’ has the binary number 0110 0010 (this is the denary number 98)
  • the letter ‘c’ has the binary number 0110 0011 (this is the denary number 99)
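The values in the list above can be checked with a short Python sketch. It uses Python's built-in ord() and format() functions; the helper name char_to_binary is made up for this illustration.

```python
def char_to_binary(ch: str) -> str:
    """Return the 8-bit binary string for a single ASCII character."""
    code = ord(ch)  # denary code for the character, e.g. 97 for 'a'
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return format(code, "08b")  # pad with leading zeros to 8 bits


for ch in "abc":
    # Print the character, its denary code and its binary code
    print(ch, ord(ch), char_to_binary(ch))
```

Running it prints one line per letter, for example "a 97 01100001".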

The ASCII code starts at denary number 0, but it does not cover only letters. It also includes control characters (such as the one produced by the return key), punctuation and other special characters, the number keys, capital letters and lower case letters.

ASCII code can only represent 128 characters, which is enough for most words in English but not enough for other languages. If you want to use accents in European languages, or larger character sets such as Cyrillic (used for Russian) and Chinese, then more characters are needed. Therefore another code, called Unicode, was created. This meant that computers could be used by people using different languages.
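As a rough illustration of why 128 codes are not enough, the sketch below looks up the Unicode number of a few characters and checks whether each one would fit in ASCII. The helper name fits_in_ascii and the choice of sample characters are assumptions made for this example.

```python
def fits_in_ascii(ch: str) -> bool:
    """True if the character's code is one of the 128 ASCII codes (0-127)."""
    return ord(ch) < 128


# Sample characters written as escapes so the code runs on any terminal
samples = {
    "Latin a": "a",
    "e with acute accent": "\u00e9",
    "Cyrillic Ya": "\u042f",
    "Chinese zhong": "\u4e2d",
}

for name, ch in samples.items():
    # Print the description, the Unicode number and whether ASCII can hold it
    print(name, ord(ch), fits_in_ascii(ch))
```

Only the first character has a code below 128. The others need the larger range of numbers that Unicode provides.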


Representing images

Images also need to be converted into binary in order for a computer to process them so that they can be seen on our screen. Digital images are made up of pixels. Each pixel in an image is made up of binary numbers.

If we say that 1 is black (or on) and 0 is white (or off), then a simple black and white picture can be created using binary.

To create the picture, a grid can be set out and the squares coloured (1 – black and 0 – white). But before the grid can be created, the size of the grid needs to be known. This data is called metadata, and computers need metadata to know the size of an image. If the metadata gives the image size as 10x10, this means the picture will be 10 pixels across and 10 pixels down.

This example shows an image created in this way:

Diagram to illustrate pixels and their make-up.
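The idea of a grid plus metadata can be sketched in a few lines of Python. This is only an illustration: the function name render, the 5x5 example picture and the use of "#" for black and "." for white are all assumptions made for this example.

```python
def render(bits: str, width: int, height: int) -> list[str]:
    """Turn a string of 1s and 0s into rows of text using the given size."""
    if len(bits) != width * height:
        raise ValueError("pixel data does not match the metadata")
    # Cut the long string of bits into rows that are `width` pixels wide
    rows = [bits[r * width:(r + 1) * width] for r in range(height)]
    # 1 is black (shown as '#') and 0 is white (shown as '.')
    return ["".join("#" if b == "1" else "." for b in row) for row in rows]


# Hypothetical 5x5 picture; the metadata (5, 5) says where each row ends
pixels = "00100" "01010" "10001" "11111" "10001"
for line in render(pixels, 5, 5):
    print(line)
```

Without the width and height, the same 25 bits could not be laid out correctly, which is why the metadata has to be stored with the image.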

Adding colour

The system described so far is fine for black and white images, but most images need to use colours as well. Instead of using just 0 and 1, using four possible numbers will allow an image to use four colours. In binary this can be represented using two bits per pixel:

  • 00 – white
  • 01 – blue
  • 10 – green
  • 11 – red

While this is still not a very large range of colours, adding another binary digit will double the number of colours that are available:

  • 1 bit per pixel (0 or 1): two possible colours
  • 2 bits per pixel (00 to 11): four possible colours
  • 3 bits per pixel (000 to 111): eight possible colours
  • 4 bits per pixel (0000 to 1111): 16 possible colours
  • 16 bits per pixel (0000 0000 0000 0000 to 1111 1111 1111 1111): over 65 000 possible colours
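The doubling pattern in this list follows from the fact that n bits can hold 2 to the power n different values. A minimal sketch, with a function name chosen just for this example:

```python
def colours_for_depth(bits_per_pixel: int) -> int:
    """Number of different colours that fit in the given number of bits."""
    return 2 ** bits_per_pixel


for bits in (1, 2, 3, 4, 16):
    # Each extra bit doubles the number of available colours
    print(bits, "bits per pixel:", colours_for_depth(bits), "colours")
```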

The number of bits used to store each pixel is called the colour depth. Images with more colours need more bits per pixel to represent each available colour. This means that images that use lots of colours are stored in larger files.

Image quality

Image quality is affected by the resolution of the image. The resolution of an image is a way of describing how tightly packed the pixels are.

In a low-resolution image, the pixels are larger so fewer are needed to fill the space. This results in images that look blocky or pixelated. An image with a high resolution has more pixels, so it looks a lot better when you zoom in or stretch it. The downside of having more pixels is that the file size will be bigger.
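Both effects on file size, the number of pixels and the number of bits per pixel, can be combined into one rough calculation. The sketch below assumes an uncompressed image and ignores the small amount of space taken by metadata; the function name and the example sizes are made up for illustration.

```python
def image_size_bits(width: int, height: int, colour_depth: int) -> int:
    """Approximate size of an uncompressed image, in bits."""
    # Every pixel needs `colour_depth` bits, and there are width x height pixels
    return width * height * colour_depth


# Same picture area at two resolutions and two colour depths
for width, height in ((10, 10), (100, 100)):
    for depth in (1, 16):
        bits = image_size_bits(width, height, depth)
        print(f"{width}x{height} at {depth} bits per pixel: {bits} bits")
```

Going from 10x10 to 100x100 multiplies the number of pixels by 100, and going from 1 bit to 16 bits per pixel multiplies the size by a further 16.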

Representing sound

Sound needs to be converted into binary for computers to be able to process it. To do this, sound is captured - usually by a microphone - and then converted into a digital signal.

An analogue to digital converter will sample a sound wave at regular time intervals. For example, a sound wave like this can be sampled at each time sample point:

Sounds are analogue and their waveforms can take any value.

The samples can then be converted to binary. They will be recorded to the nearest whole number.

Time sample   1     2     3     4     5     6     7     8     9     10
Denary        8     3     7     6     9     7     2     6     6     6
Binary        1000  0011  0111  0110  1001  0111  0010  0110  0110  0110
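The two steps described above, sampling at regular intervals and rounding to the nearest whole number before converting to binary, might be sketched as follows. The wave used here (example_wave) is a made-up stand-in and is not the wave from the diagram; the function names are also assumptions for this example.

```python
import math


def sample_wave(wave, num_samples: int, duration: float) -> list[int]:
    """Sample wave(t) at regular intervals, rounded to whole numbers."""
    interval = duration / num_samples
    return [round(wave(i * interval)) for i in range(1, num_samples + 1)]


def to_binary(samples: list[int], bits: int = 4) -> list[str]:
    """Convert each whole-number sample to a fixed-width binary string."""
    return [format(s, f"0{bits}b") for s in samples]


def example_wave(t: float) -> float:
    # Hypothetical sound wave whose values stay between 1 and 9
    return 5 + 4 * math.sin(t)


samples = sample_wave(example_wave, num_samples=10, duration=10.0)
print(samples)
print(to_binary(samples))
```

Four bits are enough here because every sample is between 0 and 15.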

If the time samples are then plotted back onto the same graph, it can be seen that the sound wave now looks different. This is because sampling does not take into account what the sound wave is doing in between each time sample.

When sampling an analogue waveform, the resulting digital sound wave is not exactly like the original.

This means that the sound loses quality, as data has been lost between the time samples. One way to increase the quality, and store the sound closer to the original, is to take more time samples that are closer together. This way, more detail about the sound can be collected, so when it’s converted to digital and back to analogue again it does not lose as much quality.
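The effect of taking more samples can be estimated with a small experiment. The sketch below assumes a very simple playback model in which each sample is held until the next one arrives, and measures how far that playback drifts from the original wave on average. The function name, the playback model and the use of a sine wave are all assumptions made for this illustration.

```python
import math


def average_error(wave, num_samples: int, duration: float,
                  checks: int = 1000) -> float:
    """Average gap between the wave and a 'hold the last sample' playback."""
    interval = duration / num_samples
    total = 0.0
    for k in range(checks):
        t = duration * k / checks
        # Index of the most recent sample taken at or before time t
        index = (k * num_samples) // checks
        held_value = wave(index * interval)
        total += abs(wave(t) - held_value)
    return total / checks


# Same wave and duration, sampled 10 times and then 100 times
print(average_error(math.sin, num_samples=10, duration=10.0))
print(average_error(math.sin, num_samples=100, duration=10.0))
```

The second figure comes out smaller than the first, because less of the wave's movement goes unrecorded between samples.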

The frequency at which samples are taken is called the sample rate, and is measured in hertz (Hz). 1 Hz is one sample per second. CD-quality audio is sampled at 44 100 Hz (44.1 kHz), and 48 000 Hz (48 kHz) is another widely used rate.


Compression

Why compress files?

Processing power and storage space are valuable on a computer. To make the best use of both, we may need to reduce the file size of text, image and audio data so that it can be transferred more quickly and takes up less storage space.

In addition, large files take a lot longer to download or upload which leads to web pages, songs and videos that take longer to load and play when using the internet.

Compression addresses these issues.

Any kind of data can be compressed. There are two main types of compression: lossy and lossless.

Lossy compression

Lossy compression removes some of a file’s original data in order to reduce the file size. This might mean reducing the numbers of colours in an image or reducing the number of samples in a sound file. This can result in a small loss of quality of an image or sound file.
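A toy example of the first idea, reducing the number of colours, is sketched below. It cuts each pixel value from 8 bits down to 4 bits by discarding the least significant bits. Real lossy formats use far more sophisticated methods; this simplified scheme, the function names and the sample values are assumptions chosen only to show that discarded data cannot be brought back.

```python
def reduce_depth(pixels: list[int], from_bits: int = 8,
                 to_bits: int = 4) -> list[int]:
    """Keep only the most significant bits of each pixel value."""
    shift = from_bits - to_bits
    return [p >> shift for p in pixels]


def restore_depth(pixels: list[int], from_bits: int = 4,
                  to_bits: int = 8) -> list[int]:
    """Scale values back up; the discarded low bits come back as zeros."""
    shift = to_bits - from_bits
    return [p << shift for p in pixels]


original = [200, 201, 37, 255]        # hypothetical 8-bit pixel values
compressed = reduce_depth(original)   # half as many bits per pixel
restored = restore_depth(compressed)

print(original)
print(compressed)
print(restored)  # close to the original values, but not identical
```

The values 200 and 201 both become 12 once the low bits are dropped, so after restoring there is no way to tell them apart.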

A popular lossy compression format for images is JPEG, which is widely used for images on the internet. A popular lossy compression format for sound is MP3. Once a file has been compressed using lossy compression, the discarded data cannot be retrieved again.

Lossless compression

Lossless compression doesn’t reduce the quality of the file at all. No data is lost, so lossless compression allows a file to be recreated exactly as it was when originally created.

There are various algorithms for doing this, usually by looking for patterns in the data that are repeated. Zip files are an example of lossless compression.
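One of the simplest pattern-based methods is run-length encoding, which stores each run of repeated values as the value plus a count. It is used here only because it is easy to follow; Zip files rely on a more elaborate algorithm. The function names below are made up for this sketch.

```python
def rle_encode(data: str) -> list[tuple[str, int]]:
    """Replace each run of repeated characters with (character, count)."""
    runs: list[tuple[str, int]] = []
    for ch in data:
        if runs and runs[-1][0] == ch:
            # Same character as the current run: extend the run by one
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            # Different character: start a new run
            runs.append((ch, 1))
    return runs


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Rebuild the original string exactly from its runs."""
    return "".join(ch * count for ch, count in runs)


row = "0000011100"  # hypothetical row of black and white pixels
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)  # decoding gives back the exact original
```

Because decoding reproduces the input exactly, no quality is lost. The saving depends on how much repetition the data contains.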

The space savings of lossless compression are not as good as they are with lossy compression.

