If I need to check whether a string has unique characters, I understand that if we are considering characters in the ASCII table, then there will be 128 of them.
However, why do we need to make a boolean array of size 256 to hold 128 characters to check whether an element occurred at least once in a string? Shouldn't a boolean array of size 128 be sufficient?
Here's a quote from the book 'Cracking the Coding Interview':
...
Basically, we use only 128 characters, which are the ones used most often in programs. But the total number of characters in the ASCII table, counting the extended range, is 256 (0 to 255). Codes 0 to 31 (32 characters in total) are the ASCII control characters. Codes 32 to 127 are the ASCII printable characters. Codes 128 to 255 are the extended ASCII codes.
Reference: http://www.ascii-code.com/
Most of the extended ASCII characters aren't present on a QWERTY (English) keyboard, which is why the author used 128 characters in that example in the 'Cracking the Coding Interview' book.
No, there are 256 ASCII characters. This includes the standard ASCII characters (0-127) and the extended ASCII characters (128-255).
For more information, please refer to: http://www.flexcomm.com/library/ASCII256.htm
Many people these days use the term 'ASCII' in a sloppy fashion to describe ISO-8859-1 (also known as Latin-1), a character set that includes the [32-126] printable-character values in the old-timey ASCII character set and also values in the range [128-255]. Latin-1 does a reasonably good job of covering Western European languages, whereas ASCII is limited to the non-accented characters used in basic English.
ASCII also includes control characters in the range [0-31] and 127. These don't represent printable characters (although Unicode provides characters at those positions). They are return, linefeed, tab, ctrl-c, formfeed and the like. Some of them are holdovers from the olden days of teletype and telex machines.
Notice how the paper tape has eight bit positions in each frame. Those are the bits of ASCII / Latin-1. 'Delete', aka Rubout, is 127 or 0111 1111. Why? Because it was possible to punch out all seven holes in the tape and so rub out a character.
That may account for the suggestion someone made to use a 256-position array to tabulate text in that kind of character set.
I believe the use of 128 and 256 in the same function is a mistake in that book edition. In the newer 6th edition (2016), the code example states:
and the author adds the comment:
It's OK to assume 256 characters. This would be the case in extended ASCII.
So, use either 128 or 256, not both, for that book exercise.
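To make the "one size, used consistently" point concrete, here is a minimal sketch in Python (the book's own code is not reproduced here, and the function name `has_unique_chars` and its `alphabet_size` parameter are made up for illustration). It assumes every character's code fits in the chosen table and raises an error otherwise.

```python
def has_unique_chars(text: str, alphabet_size: int = 128) -> bool:
    """Return True if no character occurs twice in text.

    alphabet_size is the single table size used everywhere:
    128 for 7-bit ASCII, 256 for an 8-bit "extended ASCII" encoding.
    (Hypothetical helper, not the book's code.)
    """
    # Pigeonhole: a string longer than the alphabet must repeat a character.
    if len(text) > alphabet_size:
        return False

    seen = [False] * alphabet_size
    for ch in text:
        code = ord(ch)
        if code >= alphabet_size:
            raise ValueError(f"character {ch!r} is outside the assumed alphabet")
        if seen[code]:
            return False
        seen[code] = True
    return True
```

Calling `has_unique_chars("abc")` uses the 128-entry table for both the length check and the array; passing `alphabet_size=256` switches both at once, which is the consistency the exercise needs.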
The author probably confused characters and bytes. You should also understand the related concept of encoding.
A byte is eight bits. A byte was traditionally often used to store a character, though very early computers only required 7 bits to store a character. The ASCII standard for encoding characters in 7 bits was ratified in 1963, though at the time there were also competing character encodings (of which EBCDIC still survives to this day).
When you only use 7 of the available 8 bits, you might have ideas for what to do with the spare bit. One of the common approaches was to encode additional non-standard characters which were not available in the ASCII standard. A large number of legacy 8-bit encodings have been defined, some of which have been published as standards as well. Some are popular even to this day; some examples are ISO-8859-1 (aka Latin-1) and the Windows code pages (437, 850, and 1252 are still in common use in Western countries, despite their many drawbacks). Many of them are 'extended ASCII' encodings which are compatible with ASCII in the first 128 bytes; though the term 'extended ASCII' is not really technically well-defined.
If you are processing a sequence of bytes, you do want to be able to cope with byte values in the range 0-255 and not just the ones which are defined in ASCII. On the other hand, if you have guarantees that none of the bytes you are going to process will have values above 127 (such as, for example, if your input is known to be ASCII because it comes from a source which is incapable of producing anything else), it is excessive to reserve room for values you know you are not going to need.
Going forward, most modern systems use Unicode in one form or another. On Windows, and apparently still in Java, you should expect UTF-16; elsewhere, UTF-8 is rapidly becoming the de facto standard. Both of these require your code to be able to handle 8-bit bytes cleanly, though the code points are not (necessarily, in UTF-8, or ever, in UTF-16) encoded in a single byte.
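A small Python sketch of the difference between code points and bytes, assuming nothing beyond the standard library (the helper name `byte_histogram` is made up for illustration). It shows one non-ASCII code point turning into more than one byte, and a 256-entry table that can absorb any byte value.

```python
def byte_histogram(data: bytes) -> list:
    """Count occurrences of each byte value; the table covers 0 through 255."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    return counts


text = "\u00e9"  # one code point, U+00E9 (e with acute accent)

utf8 = text.encode("utf-8")       # two bytes: C3 A9
utf16 = text.encode("utf-16-le")  # one 16-bit code unit, stored as two bytes: E9 00

print(len(text), len(utf8), len(utf16))  # prints "1 2 2"
print(byte_histogram(utf8)[0xC3])        # prints "1"
```

Both byte values in the UTF-8 form are above 127, so a 128-entry table indexed by byte value would overflow on this input.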
As for the code you posted, you are correct that 128 character positions is enough if you discard any byte whose value is larger than 127. On the other hand, depending on what data you expect to process, discarding non-ASCII characters may not at all be the right thing to do; and then, if you don't discard anything, you do need to handle all 256.
Either way, if you only discard values larger than 128, you need 129 positions in the array (there are 129 integers in the range 0 through 128). This is probably just a silly off-by-one bug.
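The off-by-one can be checked by simply counting; this is only an illustrative sketch, and the names `LIMIT`, `kept`, and `ascii_kept` are made up here.

```python
LIMIT = 128

# Byte values that survive "discard anything larger than 128".
kept = [value for value in range(256) if value <= LIMIT]
table_size = max(kept) + 1  # indices 0..128 need 129 slots

print(len(kept), table_size)  # prints "129 129"

# Discarding anything larger than 127 instead gives the 128-entry table.
ascii_kept = [value for value in range(256) if value <= 127]
print(len(ascii_kept))  # prints "128"
```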
Devanagari देवनागरी | |
---|---|
Image caption | Devanagari script (vowels top, consonants bottom) in Chandas font |
Type | Abugida |
Languages | Hindi, Sanskrit, Pali, Prakrit, Apabhramsha, Awadhi, Bhojpuri, Braj Bhasha, Chhattisgarhi, Haryanvi, Magahi, Nagpuri, Rajasthani, Bhili, Dogri, Marathi, Maithili, Nepali, Kashmiri, Konkani, Sindhi, Newar, Bodo, Mundari, Gujarati, Hindustani, and many more |
Time period | Early signs: 1st century CE,[1] modern form: 10th century CE[2][3] |
Parent systems | Proto-Sinaitic[a] |
Child systems | Gujarati, Moḍī |
Sister systems | Nandinagari |
Direction | Left-to-right |
ISO 15924 | Deva, 315 |
Unicode alias | Devanagari |
Unicode range | U+0900–U+097F Devanagari, U+A8E0–U+A8FF Devanagari Extended, U+1CD0–U+1CFF Vedic Extensions |

[a] The Semitic origin of the Brahmic scripts is not universally agreed upon.
Devanagari (/ˌdeɪvəˈnɑːɡəri/ DAY-və-NAH-gər-ee; देवनागरी, IAST: Devanāgarī, Sanskrit pronunciation: [deːʋɐˈnaːɡɐɽiː]), also called Nagari (Nāgarī, नागरी),[4] is a left-to-right abugida (alphasyllabary),[5] based on the ancient Brāhmī script,[1] used in the Indian subcontinent. It was developed in ancient India from the 1st to the 4th century CE,[1] and was in regular use by the 7th century CE.[4][6] The Devanagari script, composed of 47 primary characters including 14 vowels and 33 consonants, is one of the most adopted writing systems in the world,[7] being used for over 120 languages.[8] The ancient Nagari script for Sanskrit had two additional consonantal characters.[9]
The orthography of this script reflects the pronunciation of the language.[8] Unlike the Latin alphabet, the script has no concept of letter case.[10] It is written from left to right, has a strong preference for symmetrical rounded shapes within squared outlines, and is recognisable by a horizontal line that runs along the top of full letters.[5] In a cursory look, the Devanagari script appears different from other Indic scripts such as Bengali, Odia, or Gurmukhi, but a closer examination reveals they are very similar except for angles and structural emphasis.[5]
Among the languages using it, as either their only script or one of their scripts, are Sanskrit, Hindi,[11] Nepali, Pali, Prakrit, Apabhramsha, Awadhi, Bhojpuri, Braj Bhasha, Chhattisgarhi, Haryanvi, Magahi, Nagpuri, Rajasthani, Bhili, Dogri, Marathi, Maithili, Kashmiri, Konkani, Sindhi, Bodo, Nepalbhasa, Mundari and Santali.[8] The Devanagari script is closely related to the Nandinagari script commonly found in numerous ancient manuscripts of South India,[12][13] and it is distantly related to a number of southeast Asian scripts.[8]
Devanagari is a compound of 'deva' देव and 'nāgarī' नागरी.[4] Deva means 'heavenly or divine' and is also one of the terms for a deity in Hinduism.[14] Nagari comes from नगर (nagar), which means abode or city. Hence, Devanagari denotes 'from the abode of divinity or deities'.
Devanagari is part of the Brahmic family of scripts of India, Nepal, Tibet, and South-East Asia.[15][16] Some of the earliest epigraphical evidence attesting to the developing Sanskrit Nagari script in ancient India, in a form similar to Devanagari, is from the 1st to 4th century CE inscriptions discovered in Gujarat.[1] It is a descendant of the 3rd century BCE Brahmi script through the Gupta script, along with Siddham and Sharada.[16] Variants of script called Nāgarī, recognisably close to Devanagari, are first attested from the 1st century CE Rudradaman inscriptions in Sanskrit, while the modern standardised form of Devanagari was in use by about 1000 CE.[6][17] Medieval inscriptions suggest widespread diffusion of the Nagari-related scripts, with biscripts presenting local script along with the adoption of Nagari scripts. For example, the mid 8th-century Pattadakal pillar in Karnataka has text in both Siddha Matrika script, and an early Telugu-Kannada script; while, the Kangra Jvalamukhi inscription in Himachal Pradesh is written in both Sharada and Devanagari scripts.[18]
The Nagari script was in regular use by the 7th century CE and it was fully developed by about the end of first millennium.[4][6] The use of Sanskrit in Nagari script in medieval India is attested by numerous pillar and cave temple inscriptions, including the 11th-century Udayagiri inscriptions in Madhya Pradesh,[19] and an inscribed brick found in Uttar Pradesh, dated to be from 1217 CE, which is now held at the British Museum.[20] The script's proto- and related versions have been discovered in ancient relics outside of India, such as in Sri Lanka, Myanmar and Indonesia; while in East Asia, Siddha Matrika script considered as the closest precursor to Nagari was in use by Buddhists.[9][21] Nagari has been the primus inter pares of the Indic scripts.[9] It has long been used traditionally by religiously educated people in South Asia to record and transmit information, existing throughout the land in parallel with a wide variety of local scripts (such as Modi, Kaithi, and Mahajani) used for administration, commerce, and other daily uses.[22]
Other closely related scripts such as Siddham Matrka were in use in Indonesia, Vietnam, Japan and other parts of East Asia between the 7th and 10th centuries.[23][24] Sharada remained in parallel use in Kashmir. An early version of Devanagari is visible in the Kutila inscription of Bareilly dated to Vikram Samvat 1049 (i.e. 992 CE), which demonstrates the emergence of the horizontal bar to group letters belonging to a word.[2] One of the oldest surviving Sanskrit texts from the early post-Maurya period consists of 1,413 Nagari pages of a commentary by Patanjali, with a composition date of about 150 BCE, the surviving copy transcribed about 14th century CE.[25]
Nāgarī is the Sanskrit feminine of Nāgara 'relating or belonging to a town or city, urban'. It is a phrasing with lipi ('script') as nāgarī lipi 'script relating to a city', or 'spoken in city'.[26]
The use of the name devanāgarī emerged from the older term nāgarī.[16] According to Fischer, Nagari emerged in the northwest Indian subcontinent around 633 CE, was fully developed by the 11th-century, and was one of the major scripts used for the Sanskrit literature.[16]
Most of the southeast Asian scripts have roots in the Dravidian scripts, except for a few found in south-central regions of Java and isolated parts of southeast Asia that resemble Devanagari or its prototype. The Kawi script in particular is similar to the Devanagari in many respects though the morphology of the script has local changes. The earliest inscriptions in the Devanagari-like scripts are from around the 10th-century, with many more between 11th- and 14th-century.[27][28] Some of the old-Devanagari inscriptions are found in Hindu temples of Java, such as the Prambanan temple.[29] The Ligor and the Kalasan inscriptions of central Java, dated to the 8th-century, are also in the Nagari script of North India. According to the epigraphist and Asian Studies scholar Lawrence Briggs, these may be related to the 9th-century copper plate inscription of Devapaladeva (Bengal) which is also in early Devanagari script.[30] The term Kawi in Kawi script is a loan word from Kavya (poetry). According to anthropologists and Asian Studies scholars John Norman Miksic and Goh Geok Yian, the 8th-century version of early Nagari or Devanagari script was adopted in Java, Bali (Indonesia), and Khmer (Cambodia) around 8th or 9th-century, as evidenced by the many inscriptions of this period.[31]
The letter order of Devanagari, like nearly all Brahmic scripts, is based on phonetic principles that consider both the manner and place of articulation of the consonants and vowels they represent. This arrangement is usually referred to as the varṇamālā 'garland of letters'.[32] The format of Devanagari for Sanskrit serves as the prototype for its application, with minor variations or additions, to other languages.[33]
The vowels and their arrangement are:[34]
Articulation | Independent form | IAST/ISO | As diacritic with प | Independent form | IAST/ISO | As diacritic with प |
---|---|---|---|---|---|---|
kaṇṭhya (Guttural) | अ | a | प | आ | ā | पा |
tālavya (Palatal) | इ | i | पि | ई | ī | पी |
oṣṭhya (Labial) | उ | u | पु | ऊ | ū | पू |
mūrdhanya (Retroflex) | ऋ | ṛ/r̥ | पृ | ॠ | ṝ/r̥̄ | पॄ |
dantya (Dental) | ऌ | ḷ/l̥ | पॢ | ॡ | ḹ/l̥̄ | पॣ |
kaṇṭhatālavya (Palatoguttural) | ए | e/ē | पे | ऐ | ai | पै |
kaṇṭhoṣṭhya (Labioguttural) | ओ | o/ō | पो | औ | au | पौ |
- | अं | aṃ/aṁ | पं | अः | aḥ | पः |
- | ॲ / ऍ | ê | पॅ | ऑ | ô | पॉ |
The table below shows the consonant letters (in combination with the inherent vowel a) and their arrangement. To the right of each Devanagari letter it shows the Latin script transliteration using the International Alphabet of Sanskrit Transliteration,[43] and the phonetic value (IPA) in Hindi.[44][45]
Phonetics → | sparśa (Plosive) | sparśa (Plosive) | sparśa (Plosive) | sparśa (Plosive) | anunāsika (Nasal) | antastha (Approximant) | ūṣman/saṃghaṣhrī (Fricative) | ūṣman/saṃghaṣhrī (Fricative) |
---|---|---|---|---|---|---|---|---|
Voicing → | aghoṣa | aghoṣa | saghoṣa | saghoṣa | saghoṣa | saghoṣa | aghoṣa | saghoṣa |
Aspiration → | alpaprāṇa | mahāprāṇa | alpaprāṇa | mahāprāṇa | alpaprāṇa | alpaprāṇa | mahāprāṇa | mahāprāṇa |
kaṇṭhya (Guttural) | क ka [k] | ख kha [kʰ] | ग ga [ɡ] | घ gha [ɡʱ] | ङ ṅa [ŋ] | | | ह ha [ɦ] |
tālavya (Palatal) | च ca [c]~[tʃ] | छ cha [cʰ]~[tʃʰ] | ज ja [ɟ]~[dʒ] | झ jha [ɟʱ]~[dʒʱ] | ञ ña [ɲ] | य ya [j] | श śa [ʃ] | |
mūrdhanya (Retroflex) | ट ṭa [ʈ] | ठ ṭha [ʈʰ] | ड ḍa [ɖ] | ढ ḍha [ɖʱ] | ण ṇa [ɳ] | र ra [ɾ] | ष ṣa [ʂ] | |
dantya (Dental) | त ta [t̪] | थ tha [t̪ʰ] | द da [d̪] | ध dha [d̪ʱ] | न na [n] | ल la [l] | स sa [s] | |
oṣṭhya (Labial) | प pa [p] | फ pha [pʰ] | ब ba [b] | भ bha [bʱ] | म ma [m] | व va [ʋ] | | |
For a list of the 297 (33×9) possible Sanskrit consonant-(short) vowel syllables see Āryabhaṭa numeration.
[v] (the voiced labiodental fricative) and [w] (the voiced labio-velar approximant) are both allophones of the single phoneme represented by the letter 'व' in Hindi Devanagari. More specifically, they are conditional allophones, i.e. rules apply on whether 'व' is pronounced as [v] or [w] depending on context. Native Hindi speakers pronounce 'व' as [v] in vrat (व्रत, fast) and [w] in pakvān (पकवान, food dish), perceiving them as a single phoneme and without being aware of the allophone distinctions they are systematically making.[47] However, this specific allophony can become obvious when speakers switch languages. Non-native speakers of Hindi might pronounce 'व' in 'व्रत' as [w], i.e. as wrat instead of the more correct vrat. This results in a minor intelligibility problem because wrat can easily be confused for aurat,[citation needed] which means woman, instead of the intended fast (abstaining from food), in Hindi.[47]
Table: Compounds. Each row gives a vowel in its independent form and the corresponding dependent form (vowel sign) combined with the consonant 'k'. 'ka' is without any added vowel sign, where the vowel 'a' is inherent. The transliteration columns follow ISO 15919.[48]
ISO 15919 (vowel) | Independent vowel | ISO 15919 (with k) | Devanagari (with k) |
---|---|---|---|
a | अ | ka | क |
ā | आ | kā | का |
æ | ॲ | kæ | कॅ |
ɒ | ऑ | kɒ | कॉ |
i | इ | ki | कि |
ī | ई | kī | की |
u | उ | ku | कु |
ū | ऊ | kū | कू |
e | ऎ | ke | कॆ |
ē | ए | kē | के |
ai | ऐ | kai | कै |
o | ऒ | ko | कॊ |
ō | ओ | kō | को |
au | औ | kau | कौ |
r̥ | ऋ | kr̥ | कृ |
r̥̄ | ॠ | kr̥̄ | कॄ |
l̥ | ऌ | kl̥ | कॢ |
l̥̄ | ॡ | kl̥̄ | कॣ |
ṁ | अं | kṁ | कं |
ḥ | अः | kaḥ | कः |
(no vowel) | - | k | क् |
A vowel combines with a consonant to form their compound letter. For example, the vowel आ (ā) combines with the consonant क् (k) to form the compound का (kā), with halant removed and added vowel sign which is indicated by diacritics. The vowel अ (a) combines with the consonant क् (k) to form the compound क (ka) with halant removed. But, the compound letter series of क, ख, ग, घ .. (ka, kha, ga, gha) is without any added vowel sign, as the vowel अ (a) is inherent.
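In Unicode text the same composition can be observed directly. The following is a minimal Python sketch, not part of any standard tool: the constant names are made up for illustration, while the code points are those of the Devanagari block.

```python
import unicodedata

KA = "\u0915"             # क, consonant with inherent vowel a
VOWEL_SIGN_AA = "\u093E"  # ा, dependent form of आ
VIRAMA = "\u094D"         # ्, the halant that suppresses the inherent vowel

kaa = KA + VOWEL_SIGN_AA  # का (kā): consonant followed by a vowel sign
k_only = KA + VIRAMA      # क् (k): consonant with the inherent vowel removed

for ch in kaa + VIRAMA:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The compound का is stored as two code points, the consonant followed by the vowel sign; rendering software combines them into one visual unit.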
As mentioned, successive consonants lacking a vowel in between them may physically join together as a conjunct consonant or ligature. When Devanagari is used for writing languages other than Sanskrit, conjuncts are used mostly with Sanskrit words and loan words. Native words typically use the basic consonant and native speakers know to suppress the vowel when it is conventional to do so. For example, the native Hindi word karnā is written करना (ka-ra-nā).[49] The government of these clusters ranges from widely to narrowly applicable rules, with special exceptions within. While standardised for the most part, there are certain variations in clustering, of which the Unicode used on this page is just one scheme. The following are a number of rules:
The pitch accent of Vedic Sanskrit is written with various symbols depending on shakha. In the Rigveda, anudātta is written with a bar below the line (◌॒), svarita with a stroke above the line (◌॑) while udātta is unmarked.
The end of a sentence or half-verse may be marked with the '।' symbol (called a daṇḍa, meaning 'bar', or called a pūrṇa virām, meaning 'full stop/pause'). The end of a full verse may be marked with a double-daṇḍa, a '॥' symbol. A comma (called an alpa virām, meaning 'short stop/pause') is used to denote a natural pause in speech.[51][52] Other punctuation marks such as colon, semi-colon, exclamation mark, dash, and question mark are currently in use in Devanagari script, matching their use in European languages.[53]
The following letter variants are also in use, particularly in older texts.[56]
standard | ancient |
---|---|

० | १ | २ | ३ | ४ | ५ | ६ | ७ | ८ | ९ |
---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
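The digit correspondence above can be expressed as a simple translation table. This is an illustrative Python sketch; the helper name `to_ascii_digits` is an assumption, and the only fact relied on is that the Devanagari digits occupy U+0966 through U+096F in order.

```python
# Devanagari digits zero through nine, U+0966..U+096F.
DEVANAGARI_DIGITS = "".join(chr(0x0966 + i) for i in range(10))
TO_ASCII = str.maketrans(DEVANAGARI_DIGITS, "0123456789")


def to_ascii_digits(text: str) -> str:
    """Replace Devanagari digits with ASCII digits, leaving other text alone."""
    return text.translate(TO_ASCII)


year = "\u0968\u0966\u0968\u096A"  # the digits 2, 0, 2, 4 in Devanagari
print(to_ascii_digits(year))       # prints "2024"
print(int(year))                   # Python's int() accepts Unicode decimal digits
```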
A variety of Unicode fonts are in use for Devanagari. These include, but are not limited to, Akshar,[57] Annapurna,[58] Arial,[59] CDAC-Gist Surekh,[60] CDAC-Gist Yogesh,[61] Chandas,[62] Gargi,[63] Gurumaa,[64] Jaipur,[65] Jana,[66] Kalimati,[67] Kanjirowa,[68] Lohit Devanagari, Mangal,[69] Raghu,[70] Sanskrit2003,[71] Santipur OT,[62] Siddhanta, Thyaka,[72] and Uttara.[62]
The form of Devanagari fonts varies with function. According to Harvard College for Sanskrit studies, 'Uttara [companion to Chandas] is the best in terms of ligatures but, because it is designed for Vedic as well, requires so much vertical space that it is not well suited for the 'user interface font' (though an excellent choice for the 'original field' font). Santipur OT is a beautiful font reflecting a very early [medieval era] typesetting style for Devanagari. Sanskrit 2003[73] is a good all-around font and has more ligatures than most fonts, though students will probably find the spacing of the CDAC-Gist Surekh[60] font makes for quicker comprehension and reading.'[62]
The Google Fonts project now has a number of new Unicode fonts for Devanagari in a variety of typefaces in the Serif, Sans-Serif, Display and Handwriting categories.
There are several methods of Romanisation or transliteration from Devanagari to the Roman script.[74]
The Hunterian system is the 'national system of romanisation in India' and the one officially adopted by the Government of India.[75][76][77]
A standard transliteration convention was codified in the ISO 15919 standard of 2001. It uses diacritics to map the much larger set of Brahmic graphemes to the Latin script. The Devanagari-specific portion is nearly identical to the academic standard for Sanskrit, IAST.[78]
The International Alphabet of Sanskrit Transliteration (IAST) is the academic standard for the romanisation of Sanskrit. IAST is the de facto standard used in printed publications, like books, magazines, and electronic texts with Unicode fonts. It is based on a standard established by the Congress of Orientalists at Athens in 1912. The ISO 15919 standard of 2001 codified the transliteration convention to include an expanded standard for sister scripts of Devanagari.[78]
The National Library at Kolkata romanisation, intended for the romanisation of all Indic scripts, is an extension of IAST.
Compared to IAST, Harvard-Kyoto looks much simpler. It does not contain all the diacritic marks that IAST contains. It was designed to simplify the task of putting large amount of Sanskrit textual material into machine readable form, and the inventors stated that it reduces the effort needed in transliteration of Sanskrit texts on the keyboard.[79] This makes typing in Harvard-Kyoto much easier than IAST. Harvard-Kyoto uses capital letters that can be difficult to read in the middle of words.
ITRANS is a lossless transliteration scheme of Devanagari into ASCII that is widely used on Usenet. It is an extension of the Harvard-Kyoto scheme. In ITRANS, the word devanāgarī is written 'devanaagarii' or 'devanAgarI'. ITRANS is associated with an application of the same name that enables typesetting in Indic scripts. The user inputs text in Roman letters and the ITRANS pre-processor translates the Roman letters into Devanagari (or other Indic scripts). The latest version of ITRANS is version 5.30, released in July 2001. It is similar to the Velthuis system and was created by Avinash Chopde to help print various Indic scripts with personal computers.[79]
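The doubled-vowel spelling in the example above can be illustrated with a deliberately tiny Python sketch. It covers only the three long vowels ā, ī and ū and is not the ITRANS program or its full mapping table; the function name `iast_long_vowels_to_itrans` is hypothetical.

```python
import unicodedata

# Partial mapping, assumed for illustration: long vowels only.
LONG_VOWELS = {
    "\u0101": "aa",  # ā
    "\u012B": "ii",  # ī
    "\u016B": "uu",  # ū
}


def iast_long_vowels_to_itrans(word: str) -> str:
    """Rewrite IAST long vowels in the doubled-letter ASCII style."""
    # Normalise first so that "a" + combining macron matches the precomposed letter.
    word = unicodedata.normalize("NFC", word)
    return "".join(LONG_VOWELS.get(ch, ch) for ch in word)


print(iast_long_vowels_to_itrans("devan\u0101gar\u012B"))  # prints "devanaagarii"
```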
The disadvantage of the above ASCII schemes is case-sensitivity, implying that transliterated names may not be capitalised. This difficulty is avoided with the system developed in 1996 by Frans Velthuis for TeX, loosely based on IAST, in which case is irrelevant.
ALA-LC[80] romanisation is a transliteration scheme approved by the Library of Congress and the American Library Association, and widely used in North American libraries. Transliteration tables are based on languages, so there is a table for Hindi,[81] one for Sanskrit and Prakrit,[82] etc.
WX is a Roman transliteration scheme for Indian languages, widely used among the natural language processing community in India. It originated at IIT Kanpur for computational processing of Indian languages. The salient features of this transliteration scheme are as follows.
ISCII is an 8-bit encoding. The lower 128 codepoints are plain ASCII, the upper 128 codepoints are ISCII-specific.
It has been designed for representing not only Devanagari but also various other Indic scripts as well as a Latin-based script with diacritic marks used for transliteration of the Indic scripts.
ISCII has largely been superseded by Unicode, which has, however, attempted to preserve the ISCII layout for its Indic language blocks.
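The split between the two halves of an 8-bit encoding of this kind can be sketched in Python as follows. The helpers `classify_iscii_byte` and `ascii_part` are made-up names, and the sketch does not decode the upper half, since the ISCII code table is not given here.

```python
def classify_iscii_byte(value: int) -> str:
    """Say which half of the 8-bit code space a byte value falls in."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a byte value in the range 0..255")
    return "ascii" if value < 0x80 else "iscii-specific"


def ascii_part(data: bytes) -> str:
    """Keep only the bytes from the lower (plain ASCII) half."""
    return bytes(b for b in data if b < 0x80).decode("ascii")


print(classify_iscii_byte(0x41))  # prints "ascii"
print(classify_iscii_byte(0xA4))  # prints "iscii-specific"
```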
The Unicode Standard defines three blocks for Devanagari: Devanagari (U+0900–U+097F), Devanagari Extended (U+A8E0–U+A8FF), and Vedic Extensions (U+1CD0–U+1CFF).
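A minimal Python sketch of a block lookup based on the three ranges just listed; the dictionary and function names are assumptions made for illustration.

```python
# Block ranges as stated above (inclusive).
DEVANAGARI_BLOCKS = {
    "Devanagari": (0x0900, 0x097F),
    "Devanagari Extended": (0xA8E0, 0xA8FF),
    "Vedic Extensions": (0x1CD0, 0x1CFF),
}


def devanagari_block(ch: str):
    """Return the name of the block containing ch, or None if it is in none of them."""
    code = ord(ch)
    for name, (first, last) in DEVANAGARI_BLOCKS.items():
        if first <= code <= last:
            return name
    return None


print(devanagari_block("\u0915"))  # prints "Devanagari"
print(devanagari_block("a"))       # prints "None"
```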
Devanagari (official Unicode Consortium code chart)
Code points | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+090x | ऀ | ँ | ं | ः | ऄ | अ | आ | इ | ई | उ | ऊ | ऋ | ऌ | ऍ | ऎ | ए |
U+091x | ऐ | ऑ | ऒ | ओ | औ | क | ख | ग | घ | ङ | च | छ | ज | झ | ञ | ट |
U+092x | ठ | ड | ढ | ण | त | थ | द | ध | न | ऩ | प | फ | ब | भ | म | य |
U+093x | र | ऱ | ल | ळ | ऴ | व | श | ष | स | ह | ऺ | ऻ | ़ | ऽ | ा | ि |
U+094x | ी | ु | ू | ृ | ॄ | ॅ | ॆ | े | ै | ॉ | ॊ | ो | ौ | ् | ॎ | ॏ |
U+095x | ॐ | ॑ | ॒ | ॓ | ॔ | ॕ | ॖ | ॗ | क़ | ख़ | ग़ | ज़ | ड़ | ढ़ | फ़ | य़ |
U+096x | ॠ | ॡ | ॢ | ॣ | । | ॥ | ० | १ | २ | ३ | ४ | ५ | ६ | ७ | ८ | ९ |
U+097x | ॰ | ॱ | ॲ | ॳ | ॴ | ॵ | ॶ | ॷ | ॸ | ॹ | ॺ | ॻ | ॼ | ॽ | ॾ | ॿ |
Devanagari Extended (official Unicode Consortium code chart)
Code points | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+A8Ex | ꣠ | ꣡ | ꣢ | ꣣ | ꣤ | ꣥ | ꣦ | ꣧ | ꣨ | ꣩ | ꣪ | ꣫ | ꣬ | ꣭ | ꣮ | ꣯ |
U+A8Fx | ꣰ | ꣱ | ꣲ | ꣳ | ꣴ | ꣵ | ꣶ | ꣷ | ꣸ | ꣹ | ꣺ | ꣻ | ꣼ | ꣽ | ꣾ | ꣿ |
Vedic Extensions (official Unicode Consortium code chart)
Code points | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+1CDx | ᳐ | ᳑ | ᳒ | ᳓ | ᳔ | ᳕ | ᳖ | ᳗ | ᳘ | ᳙ | ᳚ | ᳛ | ᳜ | ᳝ | ᳞ | ᳟ |
U+1CEx | ᳠ | ᳡ | ᳢ | ᳣ | ᳤ | ᳥ | ᳦ | ᳧ | ᳨ | ᳩ | ᳪ | ᳫ | ᳬ | ᳭ | ᳮ | ᳯ |
U+1CFx | ᳰ | ᳱ | ᳲ | ᳳ | ᳴ | ᳵ | ᳶ | ᳷ | ᳸ | ᳹ | ᳺ | |||||
InScript is the standard keyboard layout for Devanagari as standardized by the Government of India. It is built into all major modern operating systems. Microsoft Windows supports the InScript layout (using the Mangal font), which can be used to input Unicode Devanagari characters. InScript is also available on some touchscreen mobile phones.
This layout was used on manual typewriters when computers were not available or were uncommon. For backward compatibility some typing tools like Indic IME still provide this layout.
Such tools work on phonetic transliteration. The user writes in Roman and the IME automatically converts it into Devanagari. Some popular phonetic typing tools are Akruti, Baraha IME and Google IME.
The Mac OS X operating system includes two different keyboard layouts for Devanagari: one is much like INSCRIPT/KDE Linux, the other is a phonetic layout called 'Devanagari QWERTY'.
Any of the Unicode font input systems is fine for Indic-language Wikipedias and other wiki projects, including the Hindi, Bhojpuri, Marathi, and Nepali Wikipedias. Some people use InScript. The majority use either Google phonetic transliteration or the Universal Language Selector input facility provided on Wikipedia. On Indic-language wiki projects, the phonetic facility provided initially was Java-based; it was later supported by the Narayam extension for phonetic input. Currently, Indic-language wiki projects are supported by the Universal Language Selector (ULS), which offers both a phonetic keyboard (Aksharantaran, Marathi: अक्षरांतरण, Hindi: लिप्यंतरण, बोलनागरी) and an InScript keyboard (Marathi: मराठी लिपी).
The Ubuntu Linux operating system supports several keyboard layouts for Devanagari, including Harvard-Kyoto, WX notation, Bolanagari and phonetic. The 'remington' typing method in Ubuntu IBUS is similar to the Krutidev typing method, popular in Rajasthan. The 'itrans' method is useful for those who know English (and the English keyboard) well but are not familiar with typing in Devanagari.
.. In the Kutila this develops into a short horizontal bar, which, in the Devanagari, becomes a continuous horizontal line .. three cardinal inscriptions of this epoch, namely, the Kutila or Bareli inscription of 992, the Chalukya or Kistna inscription of 945, and a Kawi inscription of 919 .. the Kutila inscription is of great importance in Indian epigraphy, not only from its precise date, but from its offering a definite early form of the standard Indian alphabet, the Devanagari ..
Nagari has a strong preference for symmetrical shapes, especially squared outlines and right angles [7 lines above the character grid]
(p. 110) '.. an early branch of this, as of the fourth century CE, was the Gupta script, Brahmi's first main daughter. [..] The Gupta alphabet became the ancestor of most Indic scripts (usually through later Devanagari). [..] Beginning around AD 600, Gupta inspired the important Nagari, Sarada, Tibetan and Pali scripts. Nagari, of India's northwest, first appeared around AD 633. Once fully developed in the eleventh century, Nagari had become Devanagari, or 'heavenly Nagari', since it was now the main vehicle, out of several, for Sanskrit literature.'
.. showed extremely regular patterns. As is not uncommon in a study of subphonemic detail, the objective data patterned much more cleanly than intuitive judgments .. [w] occurs when /व/ is in onglide position .. [v] occurs otherwise ..
.. With the passage of time there has emerged a practically uniform system of transliteration of Devanagari and allied alphabets. Nevertheless, no single system of Romanisation has yet developed ..
.. ISO 15919 .. There is no evidence of the use of the system either in India or in international cartographic products .. The Hunterian system is the actually used national system of romanisation in India ..
.. In India the Hunterian system is used, whereby every sound in the local language is uniformly represented by a certain letter in the Roman alphabet ..
.. The Hunterian system of transliteration, which has international acceptance, has been used ..
Thousands of manuscripts of ancient and medieval era Sanskrit texts in Devanagari have been discovered since the 19th century. Major catalogues and census include: