
As a developer, have you ever been confused by the many country, language, and time zone codes?

  • When users register, should the country list show CN or CHN?
  • For multilingual translation (i18n), should folders be named zh or zh-CN?
  • When handling video subtitles, standards require unfamiliar three-letter codes, sometimes zho and sometimes chi. What's the difference?
  • And what about seemingly random time zone identifiers like Asia/Shanghai?

After reading this, you'll fully understand the logic behind these codes and confidently use them correctly in your projects.

Core Idea: Divide and Conquer

These standards seem confusing because we approach them with a single, vague notion of "region." Software needs precision, so international standards bodies "divide and conquer" that fuzzy notion, splitting it into specific, orthogonal (mutually independent) dimensions such as country, language, and time zone, each governed by its own standard.

Our journey begins by understanding these dimensions.


1. Geographic Location: Where Am I? - ISO 3166-1

This is the foundation of all codes, answering the simplest question: "What is this country/region?"

  • Standard Name: ISO 3166-1
  • Core Mission: Provide unique identifiers for countries and regions worldwide.
  • Main Formats:
    • alpha-2 (two-letter code): e.g., US, CN, JP. This is the most common and universal form.
    • alpha-3 (three-letter code): e.g., USA, CHN, JPN. More readable, often used in data statistics and official documents.
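
The relationship between the two formats can be sketched as a simple lookup table. This is a minimal illustration, assuming a hand-written three-entry table; the names ALPHA2_TO_ALPHA3, to_alpha3, and to_alpha2 are hypothetical, and a real project would load the complete ISO 3166-1 list from a maintained data source.

```python
# Illustrative subset only (assumption): a real project would load the full
# ISO 3166-1 list from a maintained data source instead of hard-coding it.
ALPHA2_TO_ALPHA3 = {
    "US": "USA",
    "CN": "CHN",
    "JP": "JPN",
}

# Build the reverse lookup from the same table so the two never disagree.
ALPHA3_TO_ALPHA2 = {a3: a2 for a2, a3 in ALPHA2_TO_ALPHA3.items()}


def to_alpha3(alpha2: str) -> str:
    """Convert an alpha-2 code to alpha-3; raises KeyError if unknown."""
    return ALPHA2_TO_ALPHA3[alpha2.strip().upper()]


def to_alpha2(alpha3: str) -> str:
    """Convert an alpha-3 code to alpha-2; raises KeyError if unknown."""
    return ALPHA3_TO_ALPHA2[alpha3.strip().upper()]


print(to_alpha3("cn"))   # CHN
print(to_alpha2("JPN"))  # JP
```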

Developer's Practical Guide:

  • Database Design: When storing countries in a user table, create a country_code field using CHAR(2) type to store the two-letter code (alpha-2). For example:
sql
CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(255),
    country_code CHAR(2)
);
  • API Design: Region-related APIs (e.g., e-commerce shipping ranges) should use two-letter codes as parameters, for example:
http
GET /api/v1/shipping?country=CN HTTP/1.1
  • Frontend Development: In country selection dropdowns, the value in <option value="CN">China</option> should use the two-letter code. For example:
html
<select name="country">
    <option value="CN">China</option>
    <option value="US">United States</option>
    <option value="JP">Japan</option>
</select>
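
On the server side, the country parameter from such an API or form usually needs validating before it reaches the CHAR(2) column. The sketch below shows one way to do that; the function name parse_country_param and the SUPPORTED_COUNTRIES allowlist are assumptions made for illustration, not part of any standard.

```python
import re
from typing import Optional

# Hypothetical allowlist: the countries this imaginary service ships to.
SUPPORTED_COUNTRIES = {"CN", "US", "JP"}

# Shape check only: exactly two ASCII letters.
_ALPHA2_SHAPE = re.compile(r"[A-Z]{2}")


def parse_country_param(raw: Optional[str]) -> Optional[str]:
    """Return a normalized alpha-2 code, or None if the input is unusable."""
    if raw is None:
        return None
    code = raw.strip().upper()
    if not _ALPHA2_SHAPE.fullmatch(code):
        return None  # e.g. "CHN" or "China" is not an alpha-2 code
    return code if code in SUPPORTED_COUNTRIES else None


print(parse_country_param(" cn "))  # CN
print(parse_country_param("CHN"))   # None
```

Normalizing to upper case at the boundary means the database and the rest of the code only ever see one spelling of each code.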



2. Language: What Language Do I Speak? - ISO 639

This standard focuses solely on one thing: which language are we using?

  • Standard Name: ISO 639
  • Core Mission: Encode the world's languages.
  • Main Formats:
    • ISO 639-1 (two-letter code): e.g., en, zh, ja. It covers about 184 major world languages and is conventionally written in lowercase.
    • ISO 639-2 (three-letter code, divided into T and B categories): e.g., eng, zho, jpn. It covers roughly 500 languages and language groups, addressing the limited coverage of two-letter codes.
    • ISO 639-3 (three-letter code): e.g., eng, zho, jpn. It extends the three-letter scheme to cover all known individual languages (over 7,000), reusing the ISO 639-2 T codes where they exist. Some codes, such as zho, are "macrolanguages" that group more specific codes (e.g., cmn for Mandarin, yue for Cantonese).
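
As a rough illustration of how the two-letter and three-letter codes relate, the sketch below upgrades an ISO 639-1 code to its three-letter equivalent. The table contains only the three examples listed above, and the names ISO639_1_TO_3 and to_three_letter are hypothetical.

```python
# Illustrative subset only (assumption): the three examples from the list above.
ISO639_1_TO_3 = {
    "en": "eng",
    "zh": "zho",
    "ja": "jpn",
}


def to_three_letter(code: str) -> str:
    """Map a two-letter ISO 639-1 code to its three-letter equivalent."""
    code = code.strip().lower()
    if len(code) == 3:
        # Already three letters: passed through unchanged, not validated here.
        return code
    if code not in ISO639_1_TO_3:
        raise ValueError(f"unknown ISO 639-1 code: {code!r}")
    return ISO639_1_TO_3[code]


print(to_three_letter("ZH"))   # zho
print(to_three_letter("eng"))  # eng
```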



3. Precise Localization: Where Am I and What Language Do I Speak? - Locale

Now, we combine the previous two to answer a more precise question: "What specific language is the user using in a particular region?" This is the concept of Locale.

  • Standard Name: There is no single "Locale" standard. In practice, identifiers follow the IETF BCP 47 language tag specification, which combines ISO 639 language codes with ISO 3166-1 region codes (and, optionally, ISO 15924 script codes).
  • Core Mission: Precisely describe language variants in specific regions to handle differences in spelling, vocabulary, date formats, currency symbols, etc.
  • Format: language-REGION (an optional script subtag may sit in between, e.g., zh-Hans-CN)
    • en-US: English used in the United States.
    • en-GB: English used in the United Kingdom.
    • zh-CN: Chinese used in Mainland China (by convention, Simplified Chinese).
    • zh-TW: Chinese used in Taiwan (by convention, Traditional Chinese).
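
The language-REGION structure can be sketched as a small parser. This is a deliberately simplified illustration that handles only a language subtag plus an optional two-letter region; full BCP 47 tags allow more subtags than this. The names LocaleTag and parse_locale are hypothetical, and accepting an underscore separator (as in zh_CN) is an assumption about the input data.

```python
from typing import NamedTuple, Optional


class LocaleTag(NamedTuple):
    language: str           # ISO 639 code, lower case
    region: Optional[str]   # ISO 3166-1 alpha-2 code, upper case, or None


def _is_letters(text: str) -> bool:
    return text.isascii() and text.isalpha()


def parse_locale(tag: str) -> LocaleTag:
    """Parse 'language' or 'language-REGION'; anything else is rejected."""
    parts = tag.strip().replace("_", "-").split("-")
    language = parts[0].lower()
    if not (_is_letters(language) and len(language) in (2, 3)):
        raise ValueError(f"bad language subtag in {tag!r}")
    if len(parts) == 1:
        return LocaleTag(language, None)
    if len(parts) == 2 and _is_letters(parts[1]) and len(parts[1]) == 2:
        return LocaleTag(language, parts[1].upper())
    raise ValueError(f"unsupported tag in this sketch: {tag!r}")


print(parse_locale("zh_cn"))  # LocaleTag(language='zh', region='CN')
print(parse_locale("en"))     # LocaleTag(language='en', region=None)
```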

Developer's Practical Guide:

  • Software Internationalization (i18n): Place your resource files (e.g., translation strings) in folders named by Locale. Each platform has its own spelling of the Locale: Android, for example, uses values-zh-rCN/strings.xml, with an r prefix before the region code:
res/
    values/
        strings.xml
    values-zh-rCN/
        strings.xml
  • HTTP Request Headers: Parse the Accept-Language: zh-CN,zh;q=0.9 header to return the most appropriate language version for the user. For example:
http
Accept-Language: zh-CN,zh;q=0.9
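  • Date/Currency Formatting: Most modern languages and standard libraries accept a Locale as a parameter. For example, in Java, Locale.forLanguageTag builds the Locale from a BCP 47 tag (the Locale constructors are deprecated as of Java 19):
java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;

Locale locale = Locale.forLanguageTag("zh-CN");
DateFormat dateFormat = DateFormat.getDateInstance(DateFormat.DEFAULT, locale);
String dateStr = dateFormat.format(new Date());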
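
The Accept-Language parsing mentioned above can be sketched as follows. This is a minimal illustration, not a complete implementation of HTTP content negotiation: it reads the q weights, tries each requested tag in order of preference, and falls back from a specific tag to its parent (zh-CN to zh) by dropping the last subtag. The function names and the supported list are hypothetical.

```python
from typing import List, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Return (tag, q) pairs sorted by descending q; ties keep header order."""
    pairs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # a tag without an explicit weight defaults to q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        pairs.append((tag.strip(), q))
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def pick_locale(header: str, supported: List[str], default: str) -> str:
    """Pick the best supported locale for the header, else the default."""
    by_lower = {name.lower(): name for name in supported}
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        candidate = tag.lower()
        while candidate:
            if candidate in by_lower:
                return by_lower[candidate]
            # Drop the last subtag and retry: "zh-tw" -> "zh" -> "".
            candidate = candidate.rpartition("-")[0]
    return default


print(pick_locale("zh-CN,zh;q=0.9", ["en-US", "zh-CN"], "en-US"))  # zh-CN
print(pick_locale("zh-TW,en;q=0.8", ["en", "zh"], "en"))           # zh
```

Falling back by truncation keeps a zh-TW user on Chinese content when only a generic zh translation exists, rather than dropping them straight to the default language.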



4. Specialized Fields & Edge Cases: Subtitles, Multimedia & T/B Codes - ISO 639-2

Why don't video subtitles simply use zh or en? Because specialized fields require broader language coverage, and this is also the source of the "one language, multiple codes" issue.

  • Standard Name: ISO 639-2 (three-letter code)

  • Key Concept: T/B Codes (Terminology/Bibliographic Codes). For historical reasons, about 20 languages have two three-letter codes in ISO 639-2:

    • B Code (Bibliographic): Derived from the English name, mainly used in library cataloging, a historical legacy. For example, German -> ger.
    • T Code (Terminology): Derived from the language's native name, recommended for use in modern computer applications. For example, Deutsch -> deu.

    The most common example is Chinese:

    • chi is the B code (from Chinese).
    • zho is the T code (from 中文, Zhōngwén).
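
| Language | English Name | Native Name | B Code (Old/Cataloging) | T Code (New/Terminology) | Recommended Use |
| --- | --- | --- | --- | --- | --- |
| Chinese | Chinese | 中文 | chi | zho | zho |
| German | German | Deutsch | ger | deu | deu |
| French | French | Français | fre | fra | fra |
| Tibetan | Tibetan | བོད་ཡིག | tib | bod | bod |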

Developer's Practical Guide:

  • Golden Rule: Prioritize T codes! They are designed for technical applications. When dealing with legacy systems or external data, however, your code must still recognize both T and B codes.
  • Media Processing: Use T codes with FFmpeg. For example, to tag the first subtitle stream as Chinese while copying all streams without re-encoding:
bash
ffmpeg -i input.mp4 -c copy -metadata:s:s:0 language=zho output.mp4
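  • Data Cleaning: When receiving data from external sources, use a mapping function to standardize codes. For example, in Python:
python
# ISO 639-2 B (bibliographic) code -> T (terminology) code.
language_map = {
    "chi": "zho",
    "ger": "deu",
    "fre": "fra",
    "tib": "bod",
}

def normalize_language_code(code):
    """Return the T code for a known B code; other codes pass through."""
    code = code.strip().lower()  # tolerate "CHI" or " chi "
    return language_map.get(code, code)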
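
Recognizing both forms also means being able to compare them and, when a legacy system insists on it, to convert back to the B code. The following sketch covers only the four pairs from the table above; the names B_TO_T, T_TO_B, to_t_code, to_b_code, and same_language are hypothetical.

```python
# The four B/T pairs shown in the table above (not the complete ISO 639-2 list).
B_TO_T = {"chi": "zho", "ger": "deu", "fre": "fra", "tib": "bod"}

# Reverse direction, derived from the same data.
T_TO_B = {t: b for b, t in B_TO_T.items()}


def to_t_code(code: str) -> str:
    """Normalize to the T code; codes without a B/T split pass through."""
    code = code.strip().lower()
    return B_TO_T.get(code, code)


def to_b_code(code: str) -> str:
    """Convert to the B code for a legacy consumer that requires it."""
    code = code.strip().lower()
    return T_TO_B.get(code, code)


def same_language(a: str, b: str) -> bool:
    """True if two codes name the same language regardless of B/T form."""
    return to_t_code(a) == to_t_code(b)


print(same_language("chi", "ZHO"))  # True
print(to_b_code("deu"))             # ger
```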

5. The Ultimate Challenge: Time and Time Zones - IANA Time Zone Database

Why can't we use the country code US to represent U.S. time? Because the contiguous U.S. alone spans 4 time zones, and daylight saving time rules add further complexity.

  • Standard Name: IANA Time Zone Database (also known as tz database or Olson database)
  • Core Mission: Record, for each zone, its offset from UTC and the full history of its daylight saving time and offset changes.
  • Format: Area/Location, usually Continent/Representative_City
    • Asia/Shanghai
    • America/New_York
    • Europe/London

Developer's Practical Guide:

  • Golden Rule: Never calculate time zones or daylight saving time yourself!
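  • Backend Development: On the server, store all times in UTC and convert to local time only for display, using IANA identifiers. For example, in Java:
java
import java.time.*;

Instant instant = Instant.now();          // current moment, always in UTC
String timestamp = instant.toString();    // ISO 8601 string to store
ZonedDateTime local = instant.atZone(ZoneId.of("Asia/Shanghai"));  // for display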
  • Frontend Development: Browser APIs can retrieve the user's time zone. For example, in JavaScript:
javascript
const timeZone = Intl.DateTimeFormat().resolvedOptions().timeZone;
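
Putting the two halves together, the "store UTC, convert with an IANA identifier" rule might look like the sketch below in Python. The function names to_storage and to_local are hypothetical, and the zone name is assumed to arrive from the client (for example, the value the browser API above reports).

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_storage(moment: datetime) -> str:
    """Serialize an aware datetime as an ISO 8601 string in UTC."""
    if moment.tzinfo is None:
        raise ValueError("refusing a naive datetime: its zone is ambiguous")
    return moment.astimezone(timezone.utc).isoformat()


def to_local(stored: str, zone_name: str) -> datetime:
    """Convert a stored UTC string to the user's zone for display."""
    return datetime.fromisoformat(stored).astimezone(ZoneInfo(zone_name))


stored = to_storage(datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc))
print(stored)                              # 2024-06-01T12:00:00+00:00
print(to_local(stored, "Asia/Shanghai"))   # 2024-06-01 20:00:00+08:00
```

The library applies the offset and any daylight saving rules for that zone, so the application never does the arithmetic itself.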



Quick Reference Cheat Sheet

| Scenario | What Do I Need? | Standard Used | Example Code | Key Developer Insight |
| --- | --- | --- | --- | --- |
| Select Country | Unique country ID | ISO 3166-1 alpha-2 | CN, US | Store in CHAR(2) in DB, use in API params |
| Web Page or Simple Translation | Identify a major language | ISO 639-1 | zh, en | HTML lang attribute, basic i18n |
| Precise Localization | Distinguish regional language variants | IETF BCP 47 | zh-CN, en-US | i18n folder naming, HTTP headers, formatting |
| Subtitle/Audio Track Labeling | Cover as many languages as possible | ISO 639-2 | zho (recommended) | Prefer T codes, be compatible with B codes |
| Handle Local Time | Precisely calculate time and DST | IANA Time Zone DB | Asia/Shanghai | Store UTC on server, convert using IANA IDs on client |

Now, the fog has cleared. These codes are not a product of chaos but a well-designed, clearly divided system. Mastering them will enable you to:

  1. Build a clear mental model: Understand the appropriate scenarios for each code and the historical reasons behind edge cases like zho/chi.
  2. Write more robust code: Elegantly handle global user needs while maintaining compatibility with legacy data.
  3. Collaborate efficiently: Communicate precisely with your team using accurate terminology.