From the LDC Language Resource Wiki

Revision as of 13:36, 8 May 2011 by Mamandel




(Eastern Panjabi, Gurmukhi)



This document pertains primarily to Eastern Panjabi (Gurmukhi). There is some material on Western Panjabi as well.


Eastern Panjabi

(Information from Ethnologue, 2009-05-13)

  • ISO 639-3 code: pan
  • Spoken in: India: Punjab, Majhi in Gurdaspur and Amritsar districts, Bhatyiana in South Firozpur District; Rajasthan, Bhatyiana in north Ganganagar District; Haryana; Delhi; Jammu and Kashmir. Also spoken in Bangladesh and diaspora.
  • Population: 27,109,000 in India
  • Alternate names: Punjabi, Gurmukhi, Gurumukhi
  • Dialects: Panjabi Proper, Majhi, Doab, Bhatyiana (Bhatneri, Bhatti), Powadhi, Malwa, Bathi. Western Panjabi is distinct from Eastern Panjabi, although there is a chain of dialects to Western Hindi (Urdu).
  • Classification: Indo-European, Indo-Iranian, Indo-Aryan, Central zone, Panjabi
  • Script: Gur(u)mukhi and Devanagari

Western Panjabi

(Information from Ethnologue, 2009-05-13)

  • ISO 639-3 code: pnb
  • Spoken in: Mainly in the Punjab area of Pakistan.
  • Population: 60,647,207 in Pakistan (2000 WCD).
  • Alternate names: Western Punjabi, Lahnda, Lahanda, Lahndi
  • Dialects: There is a continuum of varieties between Eastern and Western Panjabi, and with Western Hindi and Urdu. 'Lahnda' is an earlier name for Western Panjabi, coined in an attempt to cover the dialect continuum between Hindko, Pahari-Potwari, and Western Panjabi in the north and Sindhi in the south.
  • Classification: Indo-European, Indo-Iranian, Indo-Aryan, Northwestern zone, Lahnda
  • Script: Perso-Arabic

Linguistic notes


Eastern Panjabi is usually written with the Brahmi-derived Gurmukhi script, and sometimes, especially by Hindus, with Devanagari. Western Panjabi is usually written in Shahmukhi, a variant of the Arabic writing system very similar to the writing system of Urdu.

Linguistic resources




Although not all Panjabi speakers are Sikhs, Sikh given names are generally not gender-specific, and most of these sites do not distinguish names by sex. [Mamandel 12:48, 8 May 2011 (UTC)]

  • 5abi: Gurbani encoding.
  • Babynology: List of Panjabi baby names in Roman transliteration. (Each name appears twice, once for each sex, since the site's general format for all languages is to list male and female names separately.)
  • IQAP Guidance on Ethnic Naming Conventions: Appendix 5 – Sikh Naming Convention (pp. 23-24). U.K. National Health Service. Discussion of naming conventions, and about 50 personal names and 45 subcaste names, in transliteration.
  • Sikh Names. Transliteration, with meanings.
  • Sikhiwiki: Sikh naming convention. Discusses the naming convention, including gender neutrality, but gives separate lists of male (38) and female (21) names in Roman transliteration. All the male names are "___ Singh" and the female names "___ Kaur", the traditional gender-based surnames (called "Complementary religious name" in "IQAP Guidance on Ethnic Naming Conventions"), but the first names are almost completely distinct (only "Davinder" appears in both).
  • Sushmajee: About 1000 names. Transliteration.

Linguistic portals and bibliographies

Encoding and Fonts

Before the development and general use of Unicode, computer use of Panjabi and other South Asian languages required special one-byte fonts. Many of these fonts were specific to one website or another and used idiosyncratic encodings. To some extent that is still the case, so this page includes some such sites (see News) and some resources for specific fonts and encoding converters.



The Unicode range for Gurmukhi is 0A00-0A7F.
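
As a rough illustration of what this range means in practice, the following Python sketch tests whether characters fall in the Gurmukhi block and reports what fraction of a string does. The function names are illustrative and do not come from any tool listed on this page. A low fraction is only a hint that a text may use one of the legacy font encodings described above, not proof.

```python
# Gurmukhi block boundaries, as given above (0A00-0A7F).
GURMUKHI_START = 0x0A00
GURMUKHI_END = 0x0A7F


def is_gurmukhi(ch: str) -> bool:
    """Return True if the single character ch lies in the Gurmukhi block."""
    return GURMUKHI_START <= ord(ch) <= GURMUKHI_END


def gurmukhi_fraction(text: str) -> float:
    """Fraction of non-whitespace characters that fall in the Gurmukhi block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(is_gurmukhi(c) for c in chars) / len(chars)
```

Genuine Unicode Gurmukhi text can still score slightly below 1.0, because shared punctuation such as the danda (U+0964) is encoded in the Devanagari block.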


The Bureau of Indian Standards supports its own encoding standard. See ISCII.


An 8-bit encoding used by a number of sites. Most of the script is in the lower half. Fonts available from SikhNet below.


An 8-bit encoding, with the script in the upper half; x20-x7f is used for punctuation, special characters, digits (both Western and Panjabi), and so on. Used by Daily Ajit, Awandhar. Font available from Sikh Students Federation below.



  • GUCA: Gurmukhi Unicode Conversion Application. GNU GPL. Requires Microsoft .NET Framework. Converts ASCII-encoded, font-based Gurmukhi text based on Dr. Thind's fonts (e.g. AnmolLipi, GurbaniLipi fonts) into Unicode. Also includes a custom mapping engine to add encodings. Note: although the site for "Dr. Thind's fonts" now uses Unicode, many other sites still use these 8-bit encodings. See SikhNet, above.
  • Unicodify: From Lancaster University, producers of the Emille corpus. For Windows; source code available.
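
The general idea behind font-to-Unicode converters of this kind can be sketched as a table lookup. The Python below is a minimal sketch under loud assumptions: the three-entry table is HYPOTHETICAL and is not the actual mapping of AnmolLipi, GurbaniLipi, or any other font, and the function is not how GUCA or Unicodify is implemented.

```python
# HYPOTHETICAL mapping, for illustration only. A real table must be taken
# from the documentation of the specific legacy font being converted.
HYPOTHETICAL_MAP = {
    "g": "\u0a17",  # GURMUKHI LETTER GA
    "r": "\u0a30",  # GURMUKHI LETTER RA
    "m": "\u0a2e",  # GURMUKHI LETTER MA
}


def convert(text: str, table: dict) -> str:
    """Replace each character found in the table; pass the rest through unchanged."""
    return "".join(table.get(ch, ch) for ch in text)
```

A one-to-one lookup is not enough for real data. Legacy fonts typically store text in visual order, so a vowel sign such as sihari (U+0A3F), which is drawn before its consonant, has to be moved after the consonant to match Unicode's logical order.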


  • Indian Language Converter. Type in Roman characters according to the Gurmukhi character chart on the page and get Gurmukhi text and HTML. On-web or download with GNU GPL. E.g.:
    Roman input: guramukhee
    Gurmukhi output: ਗੁਰਮੁਖੀ
    HTML output: &#2583;&#2625;&#2608;&#2606;&#2625;&#2582;&#2624;<br/>
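
The HTML output in this example is simply the decimal Unicode code point of each character, written as a numeric character reference. A short Python sketch shows the correspondence (the function name is illustrative, and this is not the converter's own code):

```python
import html


def to_numeric_refs(text: str) -> str:
    """Write every character as a decimal HTML numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


# The word from the example above, spelled out as code points:
# GA, sign U, RA, MA, sign U, KHA, sign II.
word = "\u0a17\u0a41\u0a30\u0a2e\u0a41\u0a16\u0a40"

refs = to_numeric_refs(word)
print(refs)  # &#2583;&#2625;&#2608;&#2606;&#2625;&#2582;&#2624;

# html.unescape reverses the conversion.
print(html.unescape(refs) == word)  # True
```

Every value lies between 2560 and 2687, the decimal form of the Gurmukhi range 0A00-0A7F.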

Data Sources

Monolingual Text



Parallel Text

Civic information and advice

  • Choose and Book: Introduction. A [UK] national electronic referral service which gives patients a choice of place, date and time for their first outpatient appointment in a hospital or clinic. PDF, Gurbani encoding.
  • EMILLE corpus. 200,000 words of text in English (information leaflets from the UK Government and various local authorities) with Eastern Panjabi translation. Free license for non-profit research use.
  • Law Society of England and Wales. Thirteen guides to common legal problems, parallel in English and about 16 other languages. (Two more guides promise other languages available soon.) [2009-06-23]


  • GUCA. Panjabi computing resource website with parallel English and Panjabi text.

Religious: Christian

  • Bible. Links for print editions.
  • Religious Passages from Cloverdale Bibleway Church: Nine sermons of William Marrion Branham. Apparently a non-Gurbani Latin encoding. PDF 350-645 kB (mean 431), est. 150k words. Paginated printing and binding, 2-up and 2-sided [pp. 0+1,2+23; 22+3,4+21; 20+5,6+19...]. Panjabi, English

Religious: Sikh

  • Guru Granth Sahib. Sikh holy texts, word lists, concordances, interlinear translations. Panjabi (Gurmukhi, Shahmukhi, Devanagari) and English. Some files Unicode, but some Gurbani encoding.
  • Punjabi Online: Mool Mantar. Religious text. Interlinear, with Panjabi in Gurbani encoding, grouped as (Panjabi1, transliteration1, Panjabi2, English). Panjabi2 may be commentary on Panjabi1, with only the commentary translated.
  • Sridasam. Religious text. 2326 pages. Unicode. Parallel text verse by verse: Panjabi and Hindi apparently complete, English translation only through p. 1466. [2009-07-23]


  • APNA Channel is a satellite channel broadcasting from Thailand, envisaged as a news channel telecasting in the Punjabi language, with an international footprint intended to cover 127 countries. (Website appears to be mostly in English. News is all English text feeds. Video programming is apparently Panjabi with Shahmukhi titles.) [2009-07-23]
  • Awaaz-E-Watan (voice of the homeland). 1 hour per week. Cable channels in Fresno, CA and vicinity. [2009-07-23]
  • MH1: music. MH1 also plans to introduce a medical channel. The MH1, the company's maiden venture, would be linked from India. The channel would primarily cater to Punjab. [Accessed 2011-05-08. Last update 2011-01-21. This plan is basically the same as that described in the update of 2009-01-19.]
  • Punjab Today: the leading 24-hour Punjabi news channel... It offers varied programming content such as Bollywood gossip, politics, and heritage and culture. [2009-07-23]

