Coding textual information on a computer

A computer is a complex device for creating, converting and storing information. Internally, however, it does not work in a form that is readable to us: graphical, textual and numerical data are all stored as arrays of binary numbers. In this article, we'll look at how text information is encoded.

What is text to us is a sequence of symbols to a computer. Each symbol is represented by a specific set of zeros and ones. Symbols here mean not only the lowercase and capital letters of the Latin alphabet, but also punctuation marks, arithmetic signs, control characters, special symbols and even the space.
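As a minimal sketch of this idea, the Python snippet below prints the zeros and ones behind a few characters. The helper name to_bits is hypothetical, and the example assumes plain ASCII text, where each character fits in one byte.

```python
def to_bits(text: str) -> list[str]:
    """Return the code of each character as an eight-digit binary string.

    Assumes the text contains only ASCII characters.
    """
    return [format(byte, "08b") for byte in text.encode("ascii")]


# A letter, a lowercase letter, a space and a punctuation mark:
# each one is just a pattern of eight zeros and ones.
for symbol, bits in zip("Hi !", to_bits("Hi !")):
    print(repr(symbol), bits)
```

Note that the space gets its own code just like any letter does.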

Binary coding of textual information

When a key is pressed, an electrical signal is sent to the keyboard's internal controller, which converts it to a binary code. The code is matched to a specific character, which is then displayed. To represent the Latin alphabet in digital form, the ASCII coding system was created and became an international standard. ASCII itself defines 7-bit codes, but in practice each character is stored in 1 byte, so a symbol is an eight-digit sequence of zeros and ones. The range runs from 00000000 to 11111111, so an 8-bit table built on ASCII can represent 256 symbols. For a single alphabet this is usually enough.
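The arithmetic behind the figure of 256, and the step of matching a code to a character, can be sketched in a few lines of Python. The name from_bits is a hypothetical helper, and the example sticks to a code from the standard ASCII range.

```python
BITS_PER_CHARACTER = 8  # one byte per character, as described above


def from_bits(bits: str) -> str:
    """Turn an eight-digit string of zeros and ones back into its character."""
    if len(bits) != BITS_PER_CHARACTER or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly eight binary digits")
    return chr(int(bits, 2))


# Eight binary digits give 2 ** 8 distinct patterns.
table_size = 2 ** BITS_PER_CHARACTER
print(table_size)             # 256

# The pattern 01000001 is the number 65, which ASCII assigns to "A".
print(from_bits("01000001"))  # A
```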

Such an 8-bit table is divided into two parts. The first 128 characters (from 00000000 to 01111111) are international: they are the standard ASCII control characters, digits, punctuation marks and letters of the English alphabet. The second part, the extension (from 10000000 to 11111111), is designed to represent national alphabets whose writing differs from Latin.
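The split into two halves can be illustrated with a short Python sketch. The function table_half is a hypothetical name; the encodings "cp1251" and "latin-1" are standard Python codecs chosen here only as two examples of different national extensions.

```python
def table_half(code: int) -> str:
    """Say which half of an 8-bit table a code belongs to."""
    if not 0 <= code <= 0b11111111:
        raise ValueError("an 8-bit code must be between 0 and 255")
    return "standard" if code <= 0b01111111 else "extension"


print(table_half(0b01000001))  # standard
print(table_half(0b11100000))  # extension

# A code from the standard half means the same thing in both tables,
# while a code from the extension depends on the national table in use.
same_byte = bytes([0b11100000])
print(bytes([0b01000001]).decode("cp1251"))   # A
print(bytes([0b01000001]).decode("latin-1"))  # A
print(same_byte.decode("cp1251"))             # Cyrillic small letter a
print(same_byte.decode("latin-1"))            # Latin small letter a with grave
```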

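The encoding of text information in ASCII is built on the principle of an increasing sequence: the later a Latin letter comes in the alphabet, the greater the value of its ASCII code. The digits follow the same principle, and so do the Russian letters in some extended tables such as CP 1251, although not in all of them (KOI-8, for example, orders Cyrillic letters by their Latin phonetic counterparts).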
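The increasing-sequence principle is easy to check. The sketch below uses two hypothetical helpers, codes and is_increasing, and Python's built-in "cp1251" and "koi8_r" codecs to compare how two Cyrillic tables order the first letters of the Russian alphabet.

```python
def codes(text: str, encoding: str = "ascii") -> list[int]:
    """Return the numeric code of every character in the given encoding."""
    return list(text.encode(encoding))


def is_increasing(values: list[int]) -> bool:
    """True if every value is strictly greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


print(codes("ABC"))   # [65, 66, 67]
print(codes("0123"))  # [48, 49, 50, 51]
print(is_increasing(codes("ABCDEFGHIJKLMNOPQRSTUVWXYZ")))  # True

# The first four Russian capital letters in two different tables.
first_letters = "\u0410\u0411\u0412\u0413"
print(is_increasing(codes(first_letters, "cp1251")))  # True
print(is_increasing(codes(first_letters, "koi8_r")))  # False
```

One practical consequence of the ordering is that comparing codes sorts English words alphabetically, as long as the letter case is the same.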

However, there are several different encodings for Cyrillic letters in the world. The most common ones are KOI-8 (an 8-bit encoding dating back to the 1970s and long used on Unix systems), ISO 8859-5 (developed by the International Organization for Standardization), CP 1251 (also known as Windows-1251, the encoding used in Windows), as well as the 2-byte form of Unicode, which can represent 65,536 characters.

Such a variety of encodings exists because they were developed at different times, for different operating systems and for different reasons. Because of this, there are often difficulties in transferring text from one system to another: if the encodings do not match, the user will see only a set of incomprehensible symbols. How can this be fixed? Word, for example, shows a message about problems with displaying the text when such a document is opened and offers several options for converting it.
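The mismatch can be reproduced in a few lines of Python. This is only a sketch: it assumes a text saved in CP 1251 and wrongly read as KOI8-R, and the variable names are illustrative. Both codec names are part of the Python standard library.

```python
original = "\u041f\u0440\u0438\u0432\u0435\u0442"  # the Russian word for "hello"

# The text is written to disk in one encoding...
stored_bytes = original.encode("cp1251")

# ...and read back by a program that assumes another one.
garbled = stored_bytes.decode("koi8_r")
print(garbled)              # unreadable, although no byte has changed
print(garbled == original)  # False

# Converting back through the wrong encoding and decoding with the right
# one restores the text, because the bytes themselves were never damaged.
repaired = garbled.encode("koi8_r").decode("cp1251")
print(repaired == original)  # True
```

This is roughly what a conversion option in a text editor does: it keeps the bytes and tries a different table for interpreting them.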

So, the encoding and processing of textual information inside the computer is a rather complicated process. Every symbol of any alphabet is just a certain sequence of binary digits; in the single-byte encodings one character occupies one cell, that is, one byte of information.
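To make the summary concrete, the sketch below shows the bytes that one Russian letter occupies in the encodings mentioned in this article. The helper encoded_bits is a hypothetical name, and "utf_16_be" is used here as an assumption standing in for the 2-byte form of Unicode.

```python
def encoded_bits(char: str, encoding: str) -> str:
    """Return the bytes of one character as groups of eight binary digits."""
    return " ".join(format(byte, "08b") for byte in char.encode(encoding))


letter = "\u042f"  # the last letter of the Russian alphabet

for encoding in ("koi8_r", "iso8859_5", "cp1251", "utf_16_be"):
    size = len(letter.encode(encoding))
    print(f"{encoding:10} {size} byte(s)  {encoded_bits(letter, encoding)}")

# Two bytes give 2 ** 16 possible codes.
print(2 ** 16)  # 65536
```

The three single-byte tables each store the letter in one byte but under a different code, while the 2-byte form needs two bytes.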

Copyright © 2018 en.birmiss.com.