What is Unicode?

Last Updated : 15 Jul, 2024

Unicode is a universal character encoding standard designed to represent text and symbols from all writing systems around the world. Every character is assigned a unique number, written as 4 to 6 hexadecimal digits and known as a code point. Unicode is standardized across computing platforms, enabling consistent representation and manipulation of text across different systems and applications.

By assigning a unique code point to every character, regardless of platform, program, or language, Unicode facilitates consistent text representation and data interchange across different systems. This global standard supports the integration and communication of diverse languages and scripts, making it essential in our increasingly interconnected digital world.

Table of Content

  • What is Unicode?
  • Key Features of Unicode
  • History of Unicode
  • Size and Growth
  • How To Type in Unicode Characters?
  • Unicode Transformation Format (UTF)

What is Unicode?

Unicode is a universal character encoding standard that assigns a unique code point to every character, symbol, and script used in writing systems around the world. Because the same code point always identifies the same character, text is represented and understood consistently across platforms, programs, and devices, which enables reliable communication and data exchange globally.

Key Features of Unicode

  • Universal Coverage: Unicode aims to encode all the characters humans use for writing, including letters, punctuation marks, emojis, mathematical symbols, and other symbols.
  • Unique Code: Each character in Unicode has a unique code point, written as 4 to 6 hexadecimal digits. For example, the letter 'A' has the code point 0041, written as U+0041.
  • Compatible with ASCII:
    • Unicode is compatible with ASCII. The first 128 Unicode code points (U+0000 to U+007F) correspond exactly to the characters of the 7-bit ASCII table.
    • In other words, ASCII is a subset of Unicode.
    • But wait! The ASCII value of 'A' is usually given as 65, while its Unicode code point is U+0041. How is Unicode compatible with ASCII?
    • The two are the same number: U+0041 is written in hexadecimal, and hexadecimal 41 equals decimal 65.
    • (0041)₁₆ = (0065)₁₀
  • Flexibility: Unicode allows new characters to be added in later versions, supporting evolving communication and language needs.
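
The hexadecimal/decimal relationship described above can be checked with a short Python sketch. The helper name `code_point_label` is an assumption made for this illustration; `ord` is Python's built-in function that returns a character's code point as an integer.

```python
def code_point_label(ch: str) -> str:
    """Return the conventional U+XXXX label for a single character.

    Hypothetical helper for illustration: pads to at least 4 hex digits,
    which matches the usual way code points are written.
    """
    return f"U+{ord(ch):04X}"


# The code point of 'A' is one number, shown here in two bases.
print(ord("A"))               # decimal value
print(hex(ord("A")))          # same value in hexadecimal
print(code_point_label("A"))  # conventional Unicode notation

# Hexadecimal 0041 and decimal 65 are the same integer.
print(int("0041", 16) == 65)
```

The label is only a way of writing the number; the code point itself is the integer returned by `ord`.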

History of Unicode

Before the development of Unicode, there were hundreds of different character encodings for assigning numbers to letters and other characters so that computers could process them. Each of these encodings was limited: none could encode enough characters to cover all of the world's languages, or even all the letters, punctuation marks, and technical symbols in regular use.

Conflicts between character encodings also meant that two encodings could use the same number to represent two different characters or even multiple numbers for the same character. Any computer would have to handle various encodings, and this arrangement increased the possibility of data corruption as data moved between different computers or encodings.
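
As a small illustration of such a conflict, the sketch below decodes the same byte with two legacy encodings that ship with Python (`latin-1` and `cp1251`). The choice of byte and encodings is an assumption made for this example, not something taken from the article.

```python
# One byte, two legacy encodings, two different characters.
raw = b"\xe9"

as_latin1 = raw.decode("latin-1")  # Western European interpretation
as_cp1251 = raw.decode("cp1251")   # Cyrillic (Windows-1251) interpretation

print(as_latin1, as_cp1251)

# Unicode gives each of these characters its own code point,
# so they can no longer be confused with one another.
print(f"U+{ord(as_latin1):04X}", f"U+{ord(as_cp1251):04X}")
```

If the receiving system guesses the wrong legacy encoding, the text is silently corrupted, which is the kind of data corruption described above.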

Versions of Unicode

Numerous versions of Unicode have been released so far:

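| Unicode Version | Year of Release | Month (Day) |
|---|---|---|
| 15.1.0 | 2023 | September 12 |
| 15.0.0 | 2022 | September 13 |
| 14.0.0 | 2021 | September 14 |
| 13.0.0 | 2020 | March 10 |
| 12.1.0 | 2019 | May 7 |
| 12.0.0 | 2019 | March 5 |
| 11.0.0 | 2018 | June 5 |
| 10.0.0 | 2017 | June 20 |
| 9.0.0 | 2016 | June 21 |
| 8.0.0 | 2015 | June 17 |
| 7.0.0 | 2014 | June 16 |
| 6.3.0 | 2013 | September 30 |
| 6.2.0 | 2012 | September 26 |
| 6.1.0 | 2012 | January 31 |
| 6.0.0 | 2010 | October 11 |
| 5.2.0 | 2009 | October 1 |
| 5.1.0 | 2008 | April 4 |
| 5.0.0 | 2006 | July 14 |
| 4.1.0 | 2005 | March 31 |
| 4.0.1 | 2004 | March |
| 4.0.0 | 2003 | April |
| 3.2.0 | 2002 | March |
| 3.1.1 | 2001 | August |
| 3.1.0 | 2001 | March |
| 3.0.1 | 2000 | August |
| 3.0.0 | 1999 | September |
| 2.1.9 | 1999 | April |
| 2.1.8 | 1998 | December |
| 2.1.5 | 1998 | August |
| 2.1.2 | 1998 | May |
| 2.0.0 | 1996 | July |
| 1.1.5 | 1995 | July |
| 1.1.0 | 1993 | June |
| 1.0.1 | 1992 | June |
| 1.0.0 | 1991 | October |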

Size and Growth

As of Unicode 15.1, the standard supports over 149,000 characters. This set continues to grow to accommodate new symbols, emojis, and characters. Here are some characters with their Unicode code points:

| Character | Unicode |
|---|---|
| 😊 | U+1F60A |
| 👍 | U+1F44D |
| 1 | U+0031 |
| + | U+002B |
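
The code points and official names of the characters listed above can be looked up with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `describe` is an assumption made for this example, and the names returned depend on the Unicode database bundled with the Python version in use.

```python
import unicodedata


def describe(ch: str) -> tuple:
    """Return (code point label, official Unicode name) for one character.

    Hypothetical helper for illustration only.
    """
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


# Look up each character from the table above.
for ch in ["😊", "👍", "1", "+"]:
    label, name = describe(ch)
    print(ch, label, name)
```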

How To Type in Unicode Characters?

  • Log in to your operating system and place the cursor where you want the character to appear.
  • Open the character picker:
    • On Windows, press the Windows key (🪟) + period (.) key.
    • On macOS, press Control + Command + Space.
  • A small window with emojis and other Unicode characters opens.
  • Search for the character you want and click on it. The character is inserted at the cursor position.
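
In source code, the same characters can be entered without a picker by using escape sequences. The sketch below uses Python string literals; the variable names are assumptions made for this example.

```python
# Four ways to write Unicode characters in Python source code.
letter_a = "\u0041"           # \u takes exactly 4 hex digits
smile = "\U0001F60A"          # \U takes exactly 8 hex digits
thumbs = "\N{THUMBS UP SIGN}" # \N{...} takes the official character name
plus = chr(0x2B)              # chr() converts a code point to a character

print(letter_a, smile, thumbs, plus)
```

The `\U` form is needed for code points above U+FFFF, because `\u` accepts only four hexadecimal digits.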

Unicode Transformation Format (UTF)

A Unicode Transformation Format (UTF) is a method of encoding Unicode code points for storage and transmission. It specifies how each code point is converted into a sequence of bytes. The most common forms are UTF-8, UTF-16, and UTF-32.

UTF-8

  • UTF-8 is a variable-width encoding in which each code point is encoded as 1 to 4 bytes.
  • UTF-8 is backward compatible with ASCII. All ASCII characters, (0-127)₁₀ or (00-7F)₁₆, are represented in UTF-8 using a single byte with the same value.
  • All other Unicode characters are represented in UTF-8 using 2 to 4 bytes.
  • UTF-8 is widely used on the internet and in UNIX-like operating systems.
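
The variable width of UTF-8 can be observed with Python's built-in `str.encode`. This is a minimal sketch; the sample characters and the helper name `utf8_length` are assumptions chosen to cover the 1, 2, 3, and 4 byte cases.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs for the given character."""
    return len(ch.encode("utf-8"))


# One sample character for each possible UTF-8 length.
samples = ["A", "é", "€", "😊"]
for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {encoded.hex()} ({utf8_length(ch)} byte(s))")

# Pure ASCII text produces identical bytes in ASCII and UTF-8.
ascii_text = "Hello"
print(ascii_text.encode("ascii") == ascii_text.encode("utf-8"))
```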

UTF-16

  • UTF-16 is also a variable-width encoding. Each code point is encoded as either 2 bytes (one 16-bit code unit) or 4 bytes (two 16-bit code units, called a surrogate pair).
  • UTF-16 is used in the Microsoft Windows operating system and in programming languages such as Java.
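
The 2-byte and 4-byte cases can be sketched as follows. The function `utf16_code_units` is a hypothetical helper written for this example; it applies the standard surrogate-pair arithmetic to a single valid character and compares the result with Python's own `utf-16-be` codec (big-endian, no byte order mark).

```python
def utf16_code_units(ch: str) -> list:
    """Return the 16-bit UTF-16 code units for one character.

    Illustrative sketch: assumes `ch` is a single valid character,
    not a lone surrogate.
    """
    cp = ord(ch)
    if cp < 0x10000:
        # Fits in one 16-bit code unit (2 bytes).
        return [cp]
    # Above U+FFFF: split the 20-bit offset into a surrogate pair (4 bytes).
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


for ch in ["A", "😊"]:
    units = utf16_code_units(ch)
    encoded = ch.encode("utf-16-be")
    print(ch, [f"{u:04X}" for u in units], encoded.hex(), len(encoded), "bytes")
```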

UTF-32

  • UTF-32 is a fixed-width encoding in which every code point is encoded as exactly 4 bytes.
  • This gives a simple one-to-one correspondence between code points and encoded units, but it is less space-efficient: a code point that would fit in 1 byte (for example, 01) still occupies 4 bytes (00000001).
  • UTF-32 is less commonly used in mainstream applications and systems because of its space inefficiency and compatibility considerations.
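
The space trade-off between the three forms can be compared directly. In this sketch the helper name `encoded_sizes` is an assumption, and the big-endian codec names (`utf-16-be`, `utf-32-be`) are used so that Python does not prepend a byte order mark to the output.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes `text` occupies in each encoding form."""
    encodings = ["utf-8", "utf-16-be", "utf-32-be"]
    return {name: len(text.encode(name)) for name in encodings}


# ASCII text is smallest in UTF-8; UTF-32 always uses 4 bytes per code point.
for text in ["A", "Hello", "😊"]:
    print(repr(text), encoded_sizes(text))
```

For text that is mostly ASCII, UTF-32 uses four times as much space as UTF-8, which is one reason it is rarely chosen for storage or transmission.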

Conclusion

Unicode stands as a crucial pillar in the realm of digital communication, bridging the gap between diverse languages and scripts. By providing a standardized and unique way to represent text, Unicode ensures that information can be accurately and consistently shared across different platforms and devices. This universality fosters global connectivity, supports multilingual content, and underpins the seamless operation of today's technology-driven world. As digital communication continues to evolve, Unicode's role in maintaining clarity and consistency in textual representation remains indispensable.

ishaanbhela