Unicode Converter

What is Unicode?

Unicode is a character encoding standard that assigns a unique number, called a code point, to every character it covers. For example:

  • "A" has code point U+0041
  • "中" has code point U+4E2D
  • "😀" has code point U+1F600
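The code points above can be checked with a few lines of Python. This is only an illustrative sketch; the helper name code_point_label is made up for this example and is not part of the tool.

```python
def code_point_label(ch: str) -> str:
    """Format a single character's code point in U+XXXX notation."""
    # ord() returns the code point as an integer; pad to at least 4 hex digits.
    return f"U+{ord(ch):04X}"


print(code_point_label("A"))   # U+0041
print(code_point_label("中"))  # U+4E2D
print(code_point_label("😀"))  # U+1F600
print(chr(0x4E2D))             # 中 (chr() goes the other way)
```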

Why Do We Need Unicode Encoding/Decoding?

In programming and network transmission, certain characters cannot be used directly and need to be converted to specific formats:

Scenario            | Encoding Format       | Example
JavaScript strings  | \uXXXX escape         | "Hello" → "\u0048\u0065\u006C\u006C\u006F"
HTML pages          | &#x...; entity        | "中" → &#x4E2D;
URL addresses       | %XX percent encoding  | "Hello" → %48%65%6C%6C%6F
CSS content         | \XXXX escape          | "中" → \4E2D

Usage Guide

Quick Start

  1. Paste content: Paste the text you want to convert in the input box
  2. Auto-detect: The tool automatically detects the input format
  3. View result: Decoded or encoded result is displayed automatically
  4. Copy: Click the "Copy Result" button
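How the tool detects the input format is not documented here. One plausible approach, sketched in Python, is to test the input against a pattern per format. The rule order, the format names, and the function detect_format are all assumptions made for this example.

```python
import re

# Hypothetical detection rules, checked in order; not the tool's actual logic.
_PATTERNS = [
    ("unicode_escape", re.compile(r"\\u(?:[0-9A-Fa-f]{4}|\{[0-9A-Fa-f]{1,6}\})")),
    ("html_entity", re.compile(r"&#(?:[xX][0-9A-Fa-f]+|[0-9]+);")),
    ("url_encoded", re.compile(r"%[0-9A-Fa-f]{2}")),
]


def detect_format(text: str) -> str:
    """Return the name of the first format whose pattern occurs in text, else 'plain'."""
    for name, pattern in _PATTERNS:
        if pattern.search(text):
            return name
    return "plain"


print(detect_format(r"\u4F60\u597D"))  # unicode_escape
print(detect_format("%E4%BD%A0"))      # url_encoded
print(detect_format("你好"))            # plain
```

This sketch only recognizes numeric HTML entities; named entities such as &amp; would need an extra rule.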

Common Use Cases

Case 1: Decode Unicode Escape

When you see a string like \u4F60\u597D and want to know what it means, paste it into the input box. It decodes to "你好".
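The same decoding can be sketched in Python. This is a minimal illustration, not the tool's implementation; the function name decode_unicode_escapes is hypothetical, and the sketch assumes any surrogate escapes in the input come in valid pairs.

```python
import re

# Matches \u{X...} (1 to 6 hex digits) or \uXXXX (exactly 4 hex digits).
_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}|\\u([0-9A-Fa-f]{4})")


def decode_unicode_escapes(text: str) -> str:
    """Replace \\uXXXX and \\u{X...} escapes with the characters they denote."""
    def to_char(match):
        return chr(int(match.group(1) or match.group(2), 16))

    decoded = _ESCAPE.sub(to_char, text)
    # Input like \uD83D\uDE00 leaves two separate surrogates in the string;
    # a UTF-16 round trip joins each pair into one real code point.
    return decoded.encode("utf-16", "surrogatepass").decode("utf-16")


print(decode_unicode_escapes(r"\u4F60\u597D"))  # 你好
print(decode_unicode_escapes(r"\uD83D\uDE00"))  # 😀
```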

Case 2: Decode URL Encoding

Chinese characters in URLs are usually percent-encoded. For example, %E4%BD%A0 decodes to "你".
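Outside the tool, the same result can be reproduced with Python's standard library. The sketch below uses urllib.parse.unquote, which turns each %XX into a byte and decodes the bytes as UTF-8 by default.

```python
from urllib.parse import unquote

print(unquote("%E4%BD%A0"))           # 你
print(unquote("%E4%BD%A0%E5%A5%BD"))  # 你好
```

Form-encoded query strings use + for spaces; urllib.parse.unquote_plus handles that variant.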

Case 3: Encode Text to Unicode

When you need to use Chinese text in JavaScript source, encode it as Unicode escapes. For example, "你好" becomes \u4F60\u597D.
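A rough Python equivalent of this encoding step is shown below. The function name encode_unicode_escapes and the ascii_too flag are assumptions for illustration; the tool may treat ASCII differently.

```python
def encode_unicode_escapes(text: str, ascii_too: bool = False) -> str:
    """Encode text as \\uXXXX escapes, using surrogate pairs above U+FFFF."""
    out = []
    for ch in text:
        if ch.isascii() and not ascii_too:
            out.append(ch)  # leave plain ASCII readable
            continue
        # UTF-16 big-endian yields one 2-byte unit for characters up to U+FFFF
        # and two units (a surrogate pair) for everything above.
        units = ch.encode("utf-16-be")
        for i in range(0, len(units), 2):
            out.append(f"\\u{units[i]:02X}{units[i + 1]:02X}")
    return "".join(out)


print(encode_unicode_escapes("你好"))                   # \u4F60\u597D
print(encode_unicode_escapes("Hello", ascii_too=True))  # \u0048\u0065\u006C\u006C\u006F
print(encode_unicode_escapes("😀"))                     # \uD83D\uDE00
```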

Case 4: Handle Emoji

Emoji can be encoded and decoded as well. For example, 😀 encodes to \uD83D\uDE00, and that escape decodes back to 😀.

Advanced Options

When the input is plain text, click "Encoding Format" to choose the output format:

  • Unicode Escape: For JavaScript/JSON
  • HTML Entity: For HTML pages
  • CSS Escape: For CSS content property
  • URL Encoded: For URL parameters
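To make the four formats concrete, here is a small Python sketch that encodes the same text in each of them. The function encode_as and its format names ("unicode", "html", "css", "url") are invented for this example and do not describe the tool's internals.

```python
from urllib.parse import quote


def encode_as(text: str, fmt: str) -> str:
    """Encode text in one of four illustrative formats."""
    if fmt == "unicode":
        # One \uXXXX per UTF-16 code unit, so emoji become surrogate pairs.
        units = text.encode("utf-16-be")
        return "".join(
            f"\\u{units[i]:02X}{units[i + 1]:02X}" for i in range(0, len(units), 2)
        )
    if fmt == "html":
        return "".join(f"&#x{ord(ch):X};" for ch in text)
    if fmt == "css":
        # Every character is escaped, so no literal hex digit can follow an escape.
        return "".join(f"\\{ord(ch):X}" for ch in text)
    if fmt == "url":
        # safe="" also escapes reserved characters such as "/" and "?".
        return quote(text, safe="")
    raise ValueError(f"unknown format: {fmt}")


for fmt in ("unicode", "html", "css", "url"):
    print(fmt, encode_as("中", fmt))
```

Note that urllib.parse.quote leaves unreserved ASCII such as letters and digits untouched, so "Hello" stays "Hello" in the "url" format of this sketch.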

FAQ

What is a Unicode escape sequence?

A Unicode escape sequence is a way to represent Unicode characters in programming languages. The format is \u followed by 4 hexadecimal digits, such as \u4F60 representing "你". Characters beyond U+FFFF (like most Emoji) do not fit in 4 digits, so they are written as a surrogate pair of two escapes, such as \uD83D\uDE00 for 😀.

What's the difference between Unicode and UTF-8?

Unicode is a character set standard that assigns a unique number to each character. UTF-8 is an encoding method for Unicode that converts each code point into a sequence of 1 to 4 bytes. For example, "中" has Unicode code point U+4E2D, and its UTF-8 encoding is E4 B8 AD (3 bytes).
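The relationship between a code point and its UTF-8 bytes can be seen directly in Python. The helper utf8_hex below is a hypothetical name used only for this illustration.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of ch as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


# One example for each UTF-8 length, from 1 byte to 4 bytes.
for ch in ["A", "\u00e9", "中", "😀"]:
    print(f"U+{ord(ch):04X} -> {utf8_hex(ch)}")
# U+0041 -> 41
# U+00E9 -> C3 A9
# U+4E2D -> E4 B8 AD
# U+1F600 -> F0 9F 98 80
```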

Why do %XX appear in URLs?

URLs can only contain ASCII characters. Non-ASCII characters, such as Chinese text, are first encoded as UTF-8 and then percent-encoded: each byte is written as %XX, where XX is a two-digit hexadecimal number. For example, "你" has UTF-8 encoding E4 BD A0, which becomes %E4%BD%A0 after URL encoding.
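As a sketch of the byte-by-byte rule, the Python snippet below percent-encodes every UTF-8 byte by hand and compares the result with the standard library's urllib.parse.quote. The function percent_encode_all is a made-up helper for this example.

```python
from urllib.parse import quote


def percent_encode_all(text: str) -> str:
    """Percent-encode every UTF-8 byte of text, including ASCII bytes."""
    return "".join(f"%{b:02X}" for b in text.encode("utf-8"))


print(percent_encode_all("你"))     # %E4%BD%A0
print(percent_encode_all("Hello"))  # %48%65%6C%6C%6F
print(quote("你"))                  # %E4%BD%A0
print(quote("Hello"))               # Hello
```

Both outputs for "Hello" decode to the same text; standard encoders such as quote simply leave unreserved ASCII characters as they are.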

What's the relationship between HTML entities and Unicode?

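HTML entities are a way to represent special characters in HTML. A numeric entity is written as &#x followed by the hexadecimal code point, or &# followed by the decimal code point, and ends with a semicolon. For example, &#x4F60; and &#20320; both represent "你".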
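The hexadecimal and decimal entity forms can be checked with Python's standard html module. The two helper functions below are hypothetical names for this sketch.

```python
import html


def to_hex_entity(ch: str) -> str:
    """Build the hexadecimal numeric entity for one character."""
    return f"&#x{ord(ch):X};"


def to_dec_entity(ch: str) -> str:
    """Build the decimal numeric entity for one character."""
    return f"&#{ord(ch)};"


print(to_hex_entity("你"))         # &#x4F60;
print(to_dec_entity("你"))         # &#20320;
print(html.unescape("&#x4F60;"))  # 你
print(html.unescape("&#20320;"))  # 你
```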

How to handle Emoji?

This tool supports Emoji. Most Emoji have code points beyond U+FFFF, so a single \uXXXX escape cannot hold them and they need a surrogate pair or a code point escape:

  • 😀 has code point U+1F600
  • Unicode escape: \uD83D\uDE00 (surrogate pair) or \u{1F600} (ES6)
  • URL encoding: %F0%9F%98%80
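The surrogate pair for a code point above U+FFFF follows from simple arithmetic, sketched here in Python. The function name surrogate_pair is an assumption for this example.

```python
from urllib.parse import quote


def surrogate_pair(code_point: int) -> tuple:
    """Split a code point above U+FFFF into its UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF have surrogate pairs")
    offset = code_point - 0x10000    # a 20-bit value
    high = 0xD800 + (offset >> 10)   # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
    return high, low


high, low = surrogate_pair(0x1F600)
print(f"\\u{high:04X}\\u{low:04X}")  # \uD83D\uDE00
print(quote("😀"))                   # %F0%9F%98%80
```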
