Text to Binary Converter — ASCII Encoding
Quick Answer
Text-to-binary conversion turns each character into its 8-bit ASCII binary representation: 'A' = 01000001, 'B' = 01000010. The ASCII standard (ANSI X3.4-1986) defines 128 characters in 7 bits, extended to 8 bits for Latin-1.
Also searched as: ascii to binary, binary translator, text encoder, binary to english
How Text to Binary Conversion Works
Converting text to binary is a two-step process: first map each character to its numeric codepoint using an encoding standard, then convert that number to its base-2 representation. For English letters and punctuation, the mapping comes from ASCII, which assigns uppercase A to 65, uppercase B to 66, the digit 0 to 48, space to 32, and so on, as tabulated in the original 1963 standard (then ASA X3.4, later ANSI X3.4). Converting 65 to binary gives 1000001, and padding to 8 bits gives 01000001. For characters outside the ASCII range, the tool uses UTF-8 encoding, which assigns one to four 8-bit bytes per character per RFC 3629. Reversing the process is just as simple: split the binary string into 8-bit groups, interpret each group as a number in base 2, and look up the corresponding character. For related tools see our hex to decimal converter and binary calculator.
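Both directions of the process above can be sketched in a few lines of JavaScript. This is a minimal sketch with hypothetical helper names (`textToBinary`, `binaryToText`); it uses the standard `TextEncoder`/`TextDecoder` APIs so that characters outside ASCII are handled via their UTF-8 bytes, as described above:

```javascript
// Encode: map the text to its UTF-8 bytes, then each byte to 8 binary digits.
function textToBinary(text) {
  return Array.from(new TextEncoder().encode(text))
    .map(byte => byte.toString(2).padStart(8, '0'))
    .join(' ');
}

// Decode: split into 8-bit groups, parse each as a base-2 byte, rebuild the text.
function binaryToText(binary) {
  const bytes = binary.trim().split(/\s+/).map(group => parseInt(group, 2));
  return new TextDecoder().decode(new Uint8Array(bytes));
}

textToBinary('AB');                // "01000001 01000010"
binaryToText('01000001 01000010'); // "AB"
```

Because encoding goes through UTF-8 bytes rather than raw codepoints, the same pair of functions round-trips accented characters and emoji, not just ASCII.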
The Conversion Formula
For a character c with codepoint n, the binary representation is produced by dividing n by 2 repeatedly and reading the remainders from bottom to top: n = q_1 * 2 + r_1, q_1 = q_2 * 2 + r_2, and so on until the quotient is zero. The binary string is r_k r_{k-1} ... r_1, left-padded with zeros to the target width. For example, the letter H is codepoint 72; dividing gives 72 = 36*2+0, 36 = 18*2+0, 18 = 9*2+0, 9 = 4*2+1, 4 = 2*2+0, 2 = 1*2+0, 1 = 0*2+1; reading the remainders from bottom to top gives 1001000, and padding to 8 bits gives 01001000. The word "Hi" becomes 01001000 01101001, which is the 8-bit binary for codepoint 72 followed by codepoint 105. In JavaScript, the one-line conversion is text.split('').map(c => c.charCodeAt(0).toString(2).padStart(8, '0')).join(' '), which covers the ASCII case; characters outside ASCII go through UTF-8 encoding first.
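The repeated-division procedure can also be written out explicitly rather than using the built-in `toString(2)`. A sketch, with `toBinary` as a hypothetical helper name:

```javascript
// Repeated division by 2: collect remainders r_1, r_2, ..., then reverse them,
// because the last remainder produced is the most significant bit.
function toBinary(n, width = 8) {
  const remainders = [];
  while (n > 0) {
    remainders.push(n % 2);  // r_1, r_2, ... in the order they are produced
    n = Math.floor(n / 2);   // next quotient
  }
  return remainders.reverse().join('').padStart(width, '0');
}

toBinary(72); // "01001000" — the letter H, matching the worked example above
```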
Key Terms You Should Know
- Bit: a single binary digit, 0 or 1; the smallest unit of data.
- Byte: a group of 8 bits that can represent 256 distinct values (0 to 255).
- ASCII: the 7-bit encoding standard for 128 English-language characters, published as ANSI X3.4-1986.
- Unicode: the universal character set with over 149,000 code points covering nearly every written language.
- UTF-8: the variable-width encoding of Unicode that uses 1 to 4 bytes per character and is backward-compatible with ASCII.
- Codepoint: the integer value assigned to a character in a given encoding standard.
- Big-endian / little-endian: the two ways of ordering bytes within a multi-byte integer; UTF-8 has no endianness issue because it is defined as a plain byte stream.
- Control character: an ASCII character in the range 0 to 31 (plus 127, DEL) that represents commands like tab, newline, or null rather than printable text.
ASCII Reference Table
The table below shows common ASCII characters with their decimal, hex, and 8-bit binary representations from the Unicode Consortium's Basic Latin block. These values have been unchanged since 1967 and form the foundation of nearly every text file, source code file, and network protocol ever written. Note that lowercase letters are exactly 32 higher than their uppercase counterparts (0x20 difference), which makes case conversion a simple bit flip.
| Char | Dec | Hex | Binary (8-bit) |
|---|---|---|---|
| Space | 32 | 0x20 | 00100000 |
| 0 | 48 | 0x30 | 00110000 |
| 9 | 57 | 0x39 | 00111001 |
| A | 65 | 0x41 | 01000001 |
| Z | 90 | 0x5A | 01011010 |
| a | 97 | 0x61 | 01100001 |
| z | 122 | 0x7A | 01111010 |
| Newline | 10 | 0x0A | 00001010 |
Practical Examples
Example 1 — Encoding the word "Hello": H is 01001000, e is 01100101, l is 01101100, l is 01101100, o is 01101111, which concatenates to 01001000 01100101 01101100 01101100 01101111 in space-separated form. The total length is 5 characters times 8 bits, i.e. 40 bits or 5 bytes.

Example 2 — Decoding a binary message: The string 01010111 01101001 01101011 01101001 decodes byte by byte to 87, 105, 107, 105, which maps to W, i, k, i, spelling "Wiki".

Example 3 — Measuring file size: A plain-text file containing the word "test" (4 characters) is 4 bytes or 32 bits on disk when stored as ASCII or UTF-8, but the same word stored as UTF-16 would be 8 bytes because UTF-16 uses at least 2 bytes per character. Most modern systems default to UTF-8 because it is the most space-efficient for Western text and still supports the full Unicode range.
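The byte counts in Example 3 can be checked directly in Node.js. A sketch using the built-in `Buffer.byteLength` API, which measures how many bytes a string occupies under a given encoding:

```javascript
// Size of the same string under two encodings (Node.js).
const word = 'test';
console.log(Buffer.byteLength(word, 'utf8'));    // 4 bytes = 32 bits
console.log(Buffer.byteLength(word, 'utf16le')); // 8 bytes: UTF-16 uses 2+ bytes per character
```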
Tips and Best Practices
- Use 8 bits per byte: 7-bit mode only works for plain ASCII and breaks on any accented or non-English character; stick to 8-bit unless you specifically need legacy behavior.
- Include spaces as separators: space-separated binary is much easier to read and debug than an unbroken stream of 1s and 0s.
- Watch for whitespace when decoding: binary-to-text will fail if the input has extra characters or mixed separators; this tool normalizes whitespace automatically.
- Handle UTF-8 multi-byte characters correctly: emoji and non-Latin scripts require 2 to 4 bytes per character, and all bytes must be decoded as a group.
- Do not confuse binary with other bases: 01000001 is binary for 65, but 41 is hex for 65; keep the base explicit.
- For cryptographic use, prefer hex: binary strings are unwieldy for keys and hashes; hex is the standard in security contexts.
- Use this tool for learning, not encryption: converting text to binary is not encryption; it is a reversible encoding anyone can undo.
Frequently Asked Questions
How do I convert text to binary?
To convert text to binary, look up each character's ASCII code and write that number in base 2, padding to 8 bits. For example, the letter A is ASCII 65, which is 01000001 in binary. Then concatenate the 8-bit groups, optionally separated by spaces. The letter B is 66 or 01000010, and the word AB becomes 01000001 01000010. This tool automates the entire process for any string of text, including UTF-8 characters beyond the ASCII range, which it encodes using one to four bytes per character according to the Unicode standard.
What is ASCII encoding?
ASCII (American Standard Code for Information Interchange) is a character encoding standard first published in 1963 that assigns a unique 7-bit number from 0 to 127 to 128 characters, including uppercase and lowercase English letters, digits, punctuation, and control codes. Its final revision, ANSI X3.4-1986, is the official American National Standard for Coded Character Sets. Most modern systems use an 8-bit extension (Extended ASCII) or UTF-8, which is backward-compatible with 7-bit ASCII so that plain English text encodes identically in both.
What is the binary code for A?
The binary code for uppercase A is 01000001, which equals decimal 65 and hex 0x41 in the ASCII table. Lowercase a is 01100001 or decimal 97. The difference of 32 (0x20, bit 5 counting from zero) between uppercase and lowercase letters is the well-known case bit that lets programmers flip case with a single XOR operation. The full uppercase alphabet runs from 01000001 (A, 65) to 01011010 (Z, 90), and the lowercase alphabet runs from 01100001 (a, 97) to 01111010 (z, 122), leaving room for punctuation between them.
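The case-bit trick mentioned above can be demonstrated in a couple of lines. A sketch with a hypothetical `toggleCase` helper; note it is only meaningful for the letters A–Z and a–z:

```javascript
// XOR with 0x20 flips bit 5, toggling an ASCII letter between cases.
function toggleCase(ch) {
  return String.fromCharCode(ch.charCodeAt(0) ^ 0x20);
}

toggleCase('A'); // "a"  (65 ^ 32 = 97)
toggleCase('z'); // "Z"  (122 ^ 32 = 90)
```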
What is the difference between 7-bit and 8-bit ASCII?
7-bit ASCII, the original 1963 standard, defines 128 characters (0 through 127) using seven bits, with the 8th bit historically used for parity checking on serial lines. 8-bit ASCII or Extended ASCII uses the full byte to encode 256 characters (0 through 255), adding accented Latin letters, line-drawing characters, and symbols in the upper range 128 through 255. There are multiple incompatible 8-bit variants (IBM Code Page 437, ISO-8859-1, Windows-1252), which is one reason UTF-8 replaced them as the web's dominant encoding after 2000.
Can binary represent non-English characters?
Yes, through UTF-8 encoding, which represents every character in the Unicode standard as one to four 8-bit bytes. ASCII characters still use a single byte starting with 0, while non-ASCII characters start with multi-byte sequences identified by leading bit patterns 110, 1110, or 11110. The Euro sign is the three-byte sequence 11100010 10000010 10101100 (E2 82 AC), and the emoji smiley face is the four-byte sequence F0 9F 98 80. UTF-8 powers 98.2 percent of all websites as of 2024 according to W3Techs and is the only encoding required by modern web standards.
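The leading bit patterns described above are easy to inspect with the standard `TextEncoder` API. A sketch with a hypothetical `utf8Bits` helper that returns each UTF-8 byte as an 8-bit binary string:

```javascript
// Show the UTF-8 bytes of a character as 8-bit binary strings.
function utf8Bits(ch) {
  return Array.from(new TextEncoder().encode(ch))
    .map(b => b.toString(2).padStart(8, '0'));
}

utf8Bits('€');  // ["11100010", "10000010", "10101100"] — leading 1110 marks a 3-byte sequence
utf8Bits('😀'); // four bytes, the first starting with 11110
```

Each continuation byte starts with the bits 10, which is how a decoder knows to consume all bytes of a sequence as a group.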
How many bits are in one character?
A single ASCII character takes 8 bits (1 byte) in nearly all modern systems, even though the original ASCII standard only needed 7 bits for its 128 defined values. The 8th bit is either unused (set to zero) or used for parity checking in legacy applications. In UTF-8 encoding, non-ASCII characters take 16, 24, or 32 bits (2 to 4 bytes) because Unicode assigns over 149,000 code points. Fixed-width UTF-32 always uses 32 bits per character but wastes space on Latin text, which is why variable-width UTF-8 is the dominant choice today.