Unicode Converter

What is Unicode?

Unicode is a character standard that assigns a unique number, called a code point, to each character across the world's writing systems. For example:

"A" has code point U+0041
"中" has code point U+4E2D
"😀" has code point U+1F600

Why Do We Need Unicode Encoding/Decoding?

In source code and network transmission, some characters cannot be written literally and must be converted to an escaped form that depends on the context:

Scenario	Encoding Format	Example
JavaScript strings	`\uXXXX` escape	`"Hello"` → `"\u0048\u0065\u006C\u006C\u006F"`
HTML pages	`&#x...;` entity	`"中"` → `&#x4E2D;`
URL addresses	`%XX` percent encoding	`"Hello"` → `%48%65%6C%6C%6F`
CSS content	`\XXXX` escape	`"中"` → `\4E2D`
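
The conversions in the table can be sketched in Python as follows. The function names are made up for illustration and are not part of this tool. The percent-encoder here escapes every byte so that it reproduces the table's "Hello" example; real URL encoders usually leave ASCII letters and digits unescaped.

```python
def js_escape(text: str) -> str:
    # One \uXXXX escape per UTF-16 code unit, so characters above U+FFFF
    # come out as a surrogate pair.
    data = text.encode("utf-16-be")
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return "".join(f"\\u{unit:04X}" for unit in units)


def html_entity(text: str) -> str:
    # Hexadecimal numeric character reference for every character.
    return "".join(f"&#x{ord(ch):X};" for ch in text)


def percent_encode_all(text: str) -> str:
    # %XX for every UTF-8 byte, including ASCII (illustration only).
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))


def css_escape(text: str) -> str:
    # Backslash plus the hex code point. If the next character in the
    # stylesheet is itself a hex digit, a trailing space is needed to end
    # the escape; that case is not handled here.
    return "".join(f"\\{ord(ch):X}" for ch in text)


print(js_escape("Hello"))
print(html_entity("中"))
print(percent_encode_all("Hello"))
print(css_escape("中"))
```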

Usage Guide

Quick Start

Paste content: Paste the text you want to convert into the input box
Auto-detect: The tool automatically detects the input format
View result: The decoded or encoded result is displayed automatically
Copy: Click the "Copy Result" button
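
How this tool detects the input format is not described here. As a rough sketch of how such auto-detection could work, the Python snippet below tests the input against a few regular expressions. The pattern list, the format names, and the detect_format function are all hypothetical, and CSS escapes are left out because they are easy to confuse with other backslash sequences.

```python
import re

# Hypothetical patterns, checked in order; the first match wins.
PATTERNS = [
    ("unicode_escape", re.compile(r"\\u[0-9A-Fa-f]{4}|\\u\{[0-9A-Fa-f]{1,6}\}")),
    ("html_entity", re.compile(r"&#x[0-9A-Fa-f]+;|&#[0-9]+;")),
    ("url_encoded", re.compile(r"%[0-9A-Fa-f]{2}")),
]


def detect_format(text: str) -> str:
    """Return the name of the first matching format, or 'plain_text'."""
    for name, pattern in PATTERNS:
        if pattern.search(text):
            return name
    return "plain_text"


for sample in [r"\u4F60\u597D", "&#x4F60;", "%E4%BD%A0", "你好"]:
    print(sample, "->", detect_format(sample))
```

A heuristic like this can misfire, for example on plain text that happens to contain "%20", so a real tool would likely let the user override the guess.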

Common Use Cases

Case 1: Decode Unicode Escape

When you see a string like \u4F60\u597D and want to know what it means, paste it into the input box. It decodes to "你好".
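
The same decoding can be reproduced outside the tool. The following is a minimal Python sketch, not this tool's implementation; decode_unicode_escapes is a hypothetical helper name, and unpaired surrogates in the input are not handled.

```python
import re


def decode_unicode_escapes(text: str) -> str:
    # Turn each \uXXXX escape into the UTF-16 code unit it names.
    units = re.sub(
        r"\\u([0-9A-Fa-f]{4})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )
    # Round-trip through UTF-16 so that a high/low surrogate pair is
    # merged into the single character it represents.
    return units.encode("utf-16", "surrogatepass").decode("utf-16")


print(decode_unicode_escapes(r"\u4F60\u597D"))  # 你好
print(decode_unicode_escapes(r"\uD83D\uDE00"))  # 😀
```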

Case 2: Decode URL Encoding

Chinese characters in URLs usually appear in percent-encoded form. For example, %E4%BD%A0 decodes to "你".
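
For comparison, Python's standard library performs the same decoding with urllib.parse.unquote, which assumes UTF-8 by default. The example.com URL below is a made-up placeholder.

```python
from urllib.parse import unquote

print(unquote("%E4%BD%A0"))  # 你

# A hypothetical search URL with a percent-encoded query value.
print(unquote("https://example.com/search?q=%E4%BD%A0%E5%A5%BD"))
```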

Case 3: Encode Text to Unicode

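When you need to use Chinese text in JavaScript without depending on the source file's encoding, encode it as escape sequences. For example, "你好" becomes "\u4F60\u597D".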
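
One way to produce such escapes in a script, assuming Python is available, is the standard json module: by default it escapes every non-ASCII character as \uXXXX and uses surrogate pairs above U+FFFF. The hex digits come out in lowercase, which JavaScript accepts as well. to_js_literal is a hypothetical helper name.

```python
import json


def to_js_literal(text: str) -> str:
    # ensure_ascii=True is the default; it is spelled out here for clarity.
    return json.dumps(text, ensure_ascii=True)


print(to_js_literal("你好"))  # "\u4f60\u597d"
print(to_js_literal("😀"))    # "\ud83d\ude00"
```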

Case 4: Handle Emoji

Emoji can be encoded and decoded as well. For example, 😀 encodes to \uD83D\uDE00, and that sequence decodes back to 😀.

Advanced Options

When the input is plain text, click "Encoding Format" to choose the output format:

Unicode Escape: For JavaScript/JSON
HTML Entity: For HTML pages
CSS Escape: For the CSS content property
URL Encoded: For URL parameters

FAQ

What is a Unicode escape sequence?

A Unicode escape sequence is a way to write a Unicode character in source code using only ASCII. In JavaScript and JSON, the format is \u followed by exactly 4 hexadecimal digits, such as \u4F60 for "你". Characters beyond U+FFFF (like most Emoji) do not fit in 4 digits, so they are written as a surrogate pair of two escapes, such as \uD83D\uDE00 for 😀.
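
The surrogate pair for a code point follows from the UTF-16 rules: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves. A small Python sketch with hypothetical function names:

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000      # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1F600)
print(f"\\u{high:04X}\\u{low:04X}")  # \uD83D\uDE00
```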

What's the difference between Unicode and UTF-8?

Unicode is a character set standard that assigns a unique number (code point) to each character. UTF-8 is one encoding of Unicode: it converts each code point into a sequence of 1 to 4 bytes. For example, "中" has code point U+4E2D, and its UTF-8 encoding is E4 B8 AD (3 bytes).
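
To make the distinction concrete, here is a hand-written UTF-8 encoder in Python, checked against the built-in one. It is a teaching sketch: utf8_encode is a hypothetical name, and the surrogate range U+D800 to U+DFFF is not rejected as a strict encoder would require.

```python
def utf8_encode(code_point: int) -> bytes:
    if code_point < 0x80:        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6), 0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (code_point >> 12),
            0x80 | ((code_point >> 6) & 0x3F),
            0x80 | (code_point & 0x3F),
        ])
    return bytes([               # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        0xF0 | (code_point >> 18),
        0x80 | ((code_point >> 12) & 0x3F),
        0x80 | ((code_point >> 6) & 0x3F),
        0x80 | (code_point & 0x3F),
    ])


for ch in ["A", "中", "😀"]:
    encoded = utf8_encode(ord(ch))
    assert encoded == ch.encode("utf-8")  # agrees with the built-in encoder
    print(ch, f"U+{ord(ch):04X}", encoded.hex(" ").upper())
```

For "中" this prints the code point U+4E2D next to the bytes E4 B8 AD: the code point is the character's number, and the bytes are how UTF-8 stores that number.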

Why does %XX appear in URLs?

URLs may contain only a limited set of ASCII characters. Non-ASCII characters such as Chinese must be percent-encoded: the character is first encoded as UTF-8, and each resulting byte is then written as %XX, where XX is a two-digit hexadecimal number. For example, "你" has UTF-8 encoding E4 BD A0, which becomes %E4%BD%A0 after URL encoding.
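
In Python, urllib.parse.quote applies this rule. It is shown here only as an illustration and may differ in detail from this tool's output: by default it leaves ASCII letters, digits, and "/" untouched, and the safe parameter controls which reserved characters are kept.

```python
from urllib.parse import quote

print(quote("你"))              # %E4%BD%A0
print(quote("Hello 你"))        # Hello%20%E4%BD%A0
print(quote("a/b?c", safe=""))  # a%2Fb%3Fc
```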

What's the relationship between HTML entities and Unicode?

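HTML entities are a way to represent special characters in HTML. A numeric entity is written as &#x followed by the hexadecimal code point and a semicolon, or as &# followed by the decimal code point and a semicolon. For example, &#x4F60; and &#20320; both represent "你".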
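
Both numeric forms can be checked with Python's standard html module. The numeric_entities helper is a hypothetical name used only for this illustration.

```python
import html
from typing import Tuple


def numeric_entities(ch: str) -> Tuple[str, str]:
    """Return the hexadecimal and decimal entity for one character."""
    code_point = ord(ch)
    return f"&#x{code_point:X};", f"&#{code_point};"


hex_form, dec_form = numeric_entities("你")
print(hex_form, dec_form)

# Both forms decode back to the same character.
print(html.unescape(hex_form), html.unescape(dec_form))
```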

How to handle Emoji?

This tool fully supports Emoji processing. Most Emoji have code points above U+FFFF, so they need a surrogate pair in \uXXXX escapes and 4 bytes in UTF-8. For example:

😀 has code point U+1F600
Unicode escape: \uD83D\uDE00 (surrogate pair) or \u{1F600} (ES6 and later)
URL encoding: %F0%9F%98%80

Related Tools

Base64 Encoder/Decoder - Base64 encoding and decoding
URL Encoder/Decoder - URL encoding and decoding
JSON Formatter - JSON formatting and minification
