Slide1
Path to Discovery!
Romanization & Scripts for Non-Latin/Arabic Materials
Challenges & Potential
Iman Dagher, UCLA
PCC Participants Meeting, January 26, 2020
ALA Midwinter Conference, Philadelphia, PA
Slide2
Non-Latin Scripts in Libraries’ Data
Chinese 汉字 漢字
Arabic العربية
Devanagari देवनागरी
Bengali-Assamese বাংলা-অসমীয়া
Cyrillic Кириллица
Kana かな カナ
Hangul 한글 조선글
Telugu తెలుగు
Tamil தமிழ்
Gujarati ગુજરાતી
Kannada ಕನ್ನಡ
Malayalam മലയാളം
Thai ไทย
Gurmukhi ਗੁਰਮੁਖੀ
Lao ລາວ
Hebrew אלפבית
Khmer ខ្មែរ
Armenian Հայոց
Mongolian ᠮᠣᠩᠭᠣᠯ
Burmese မြန်မာ
Slide3
Objectives
Romanization: Challenges and Potential
Romanization: Challenges with Arabic materials
Potential of adding scripts
Scripts within Name Authority File
Adding scripts retrospectively: UCLA project
Slide4
PCC Practices
RDA instructs catalogers to transcribe data in the language and script found in the resource. LC-PCC PS 1.4 says to apply the first alternative, i.e., to record the elements in transliterated form
Follow the ALA-LC Romanization Tables
Use of MARC Model A for bibliographic records: vernacular and transliteration
The original-script fields are coded as 880 parallel fields in bibliographic records
In OCLC, parallel fields display with the same MARC tags as their linked Latin equivalents
Field 066
Use of MARC Model B for authority records provides unlinked non-Latin script fields
Adding scripts is optional but recommended
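The link between a romanized field and its 880 parallel field can be sketched in Python. This is a minimal illustration, assuming fields are held as plain dictionaries rather than parsed by a MARC library. The `pair_parallel_fields` helper and the sample record with its title are hypothetical. The sketch relies only on the MARC 21 convention that subfield $6 carries `880-NN` in the romanized field and `TAG-NN`, optionally followed by a script code and orientation, in the 880 field.

```python
def pair_parallel_fields(fields):
    """Pair each romanized field with its 880 original-script field via $6."""
    # Index 880 fields by (linked tag, occurrence number) taken from their $6.
    originals = {}
    for f in fields:
        if f["tag"] == "880" and "6" in f["subfields"]:
            # $6 looks like "245-01/(3/r": linked tag, occurrence, script, direction.
            link = f["subfields"]["6"].split("/")[0]
            tag, occurrence = link.split("-")
            originals[(tag, occurrence)] = f

    pairs = []
    for f in fields:
        if f["tag"] != "880" and "6" in f["subfields"]:
            # In the romanized field, $6 looks like "880-01".
            _, occurrence = f["subfields"]["6"].split("/")[0].split("-")
            match = originals.get((f["tag"], occurrence))
            if match is not None:
                pairs.append((f, match))
    return pairs


# Hypothetical sample record (illustrative title, not taken from the slides).
record = [
    {"tag": "245", "subfields": {"6": "880-01", "a": "Kitāb al-ḥayawān"}},
    {"tag": "880", "subfields": {"6": "245-01/(3/r", "a": "كتاب الحيوان"}},
    {"tag": "250", "subfields": {"a": "al-Ṭabʻah 1."}},
]

for romanized, original in pair_parallel_fields(record):
    print(romanized["tag"], romanized["subfields"]["a"], "<->", original["subfields"]["a"])
```

Fields without a $6, such as the 250 in the sample, are simply left unpaired, which mirrors a record where only some fields have script added.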
Slide5
Romanization in Practice: Advantages
Romanized data supports many different library tasks
Ability to integrate materials into library-wide processing and handling procedures: acquisition, shelf-listing, serials check-in, circulation, preservation, reference, interlibrary loan, etc.
Not all library staff need specialized language knowledge
Ability to find resources even if you don’t know the language
Some library systems do not support all non-Latin scripts
Slide6
Romanization in Practice: Challenges
Lack of consistent Romanization
Romanization is not perfect: lack of consensus
Record duplications
ALA-LC Romanization criticism: ambiguous, complex, unrepresentative
Different Romanization tables: ALA-LC; ISO-based Romanization; German; etc.
Patron confusion and frustration in discovering and correctly citing library materials
Training issues: library staff & patrons
Cataloging challenges: searching and recording
Need to maintain macros
Romanization issues with certain languages
Slide7
Arabic Script, Arabic Language: Some Facts
Arabic is one of the most widely used scripts in the world
Around 660 million individuals use Arabic script to communicate in a number of languages, including: Urdu, Pashto, Arabic, Punjabi, Persian, Malaysian, and Kurdish
The Arabic language is a Semitic language with about 221 million speakers, spoken in more than 34 countries
Over 30 different varieties of colloquial Arabic
Modern Standard Arabic is the universal language of the Arabic-speaking world and is understood by all Arabic speakers
Arabic is written from right to left
Slide8
Romanization: Challenges with Arabic materials
Following certain rules in the ALA-LC Romanization tables for Arabic can be challenging
Mastering the ALA-LC Romanization table for the Arabic language (total of 26 rules) requires special training and skills
Searching for and locating records is time-consuming (ISBN not always present or correct)
Different sources to consult; some sources such as Hans Wehr suggest 2 Romanized forms for certain words
Requires adding 246 (other forms of titles)
Romanizing certain titles requires familiarity with the culture
The Arabic language relies on Tashkil (or Tahrik), i.e. vocalization
Slide9
Romanization: Issues with the Arabic Language
Tashkil is the addition of diacritics to Arabic letters to indicate vowels, or their absence, and gemination
Arabic texts in library resources are mostly written without tashkil. Fluent speakers are able to fill in these diacritics automatically
Arabic is a highly inflected language. Romanization requires good grammatical knowledge
© Mamoun Sakkal 1997
Slide10
Romanization: Issues with the Arabic Language
Romanizing foreign words:
Bibliography = ببليوجرافيا
bibliyūjirāfiyā
bibliyūjrāfiyā
bibliyūjrāfyā
= ببليوغرفيا
bibliyūghrafiyā
bibliyūghrafyā
bibliyūghirafiyā
Slide11
Romanization: Issues with the Arabic Language
The position of the hamzah [ء] on the alif (أ or إ) affects the initial vowel: a, u, or i
1. imām = leader إمام
amāma = in front أمام
2. Africa: إفريقيا Ifriqiyā OR أفريقيا Afriqiyā
Erbil: أربيل Arbīl OR إربيل Irbīl [Irbīl in NAF]
Slide12
Romanization: Issues with the Arabic Language
Geographic names
Inconsistency
America
أميركا Amīrikā; Amīrkā
أمريكا Amrīkā
Authorized forms vs. commonly used names
The NAF record for Rabat (Morocco) suggests the Romanization: Rabāṭ
But the most commonly known form is: Ribāṭ الرِباط
Slide13
Romanization: Issues with Arabic Personal Names
Without diacritics, it is sometimes difficult to predict how a name is pronounced
حَسَن Ḥasan, a male name
حُسْن Ḥusn, a female name
Names may be pronounced differently depending on the origin of the author
Munassá
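The ambiguity between Ḥasan and Ḥusn can be shown with a short Python sketch. It assumes that tashkil marks can be identified as Unicode nonspacing marks (category `Mn`), which covers the common vocalization signs; the `strip_tashkil` helper is a hypothetical name, not part of any cataloging tool. Once the marks are removed, as they usually are in printed resources, the two names become the same string.

```python
import unicodedata


def strip_tashkil(text):
    """Remove Arabic vocalization marks (and any other nonspacing marks)."""
    # Category "Mn" covers fathah, dammah, kasrah, sukun, shaddah, and tanwin.
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Vocalized forms from the slide, written with escapes to avoid ambiguity.
hasan = "\u062D\u064E\u0633\u064E\u0646"  # حَسَن  Ḥasan
husn = "\u062D\u064F\u0633\u0652\u0646"   # حُسْن  Ḥusn

# Both collapse to the same three-letter skeleton: ح س ن
print(strip_tashkil(hasan) == strip_tashkil(husn))
```

Because the unvocalized form carries less information than either Romanization, a cataloger has to supply the vowels from outside knowledge of the author.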
Slide14
Romanization: Issues with Arabic Personal Names
Issue with normalization: searching for Salim can retrieve two different names
سليم [Salīm]
سالم [Sālim]
North African names have their special challenges with Romanization since pronunciation may be influenced by the French language
Issues with Latin names in Arabic script
Shakespeare, William شكسبير، وليام
More variants are recommended
Adding script variants can be the solution
Salim
Shakspīr, Walyam; Shakspīr, Wilyam; Shaykspīr, Wilyam
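The normalization issue with Salim can be reproduced in a few lines of Python. This is a sketch under the assumption that a search index folds Romanized headings by dropping diacritics and case; actual library systems apply their own normalization rules, and `search_key` is a hypothetical helper. The two Romanized names fall together under one key, while their Arabic-script forms remain distinct.

```python
import unicodedata


def search_key(name):
    """Fold a Romanized name to a diacritic-free, lowercase search key."""
    decomposed = unicodedata.normalize("NFD", name)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.lower()


# Romanized form and script form for the two names on the slide.
names = [("Salīm", "سليم"), ("Sālim", "سالم")]

index = {}
for romanized, script in names:
    index.setdefault(search_key(romanized), []).append((romanized, script))

# One folded key retrieves both names; the script forms still differ.
print(index[search_key("Salim")])
```

Searching on the script form instead of the folded Romanization would separate the two names again, which is the precision gain the slide points to.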
Slide15
Romanization: Issues with Arabic Personal Names
Authors who write in both Arabic & Persian
Different romanization tables
Variant Romanized forms are necessary
Adding scripts would solve the issue
Example:
Jawādī Āmulī, ʻAbd Allāh (Arabic)
Javādī Āmulī, ʻAbd Allāh (Persian)
جوادي آملي، عبد الله
Slide16
Romanization: Issues with Arabic Personal Names: Duplicates
Jubrān, Thurayā, $d 1952- no2015104931
Jubrān, Thurayyā, $d 1952- n 2016006592
Jibrān, Thurayā, $d 1952-
جبران، ثريا، 1952-

ʻImmīsh, Ibrāhīm Fatḥī no2011054221
ʻUmaysh, Ibrāhīm Fatḥi n 2011037693
عميش، ابراهيم فتحي

Ḥamīd, ʻAbd al-Laṭīf ibn Muḥammad no 95050509
Ḥumayd, ʻAbd al-Laṭīf ibn Muḥammad n 93902168
Ḥumayyid, ʻAbd al-Laṭīf ibn Muḥammad
حميد، عبد اللطيف بن محمد
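The duplicates above suggest one way script forms could help: grouping headings by their Arabic-script form exposes records that differ only in Romanization. The Python sketch below is illustrative only. It assumes each record already carries a script variant, which is not guaranteed in the authority file, and the `records` list is a hand-built structure using the headings and control numbers shown on the slide (`None` where the slide gives no number).

```python
from collections import defaultdict

# (Romanized heading, control number as shown on the slide, script form)
records = [
    ("Jubrān, Thurayā, 1952-", "no2015104931", "جبران، ثريا، 1952-"),
    ("Jubrān, Thurayyā, 1952-", "n 2016006592", "جبران، ثريا، 1952-"),
    ("ʻImmīsh, Ibrāhīm Fatḥī", "no2011054221", "عميش، ابراهيم فتحي"),
    ("ʻUmaysh, Ibrāhīm Fatḥi", "n 2011037693", "عميش، ابراهيم فتحي"),
    ("Ḥamīd, ʻAbd al-Laṭīf ibn Muḥammad", "no 95050509", "حميد، عبد اللطيف بن محمد"),
    ("Ḥumayd, ʻAbd al-Laṭīf ibn Muḥammad", "n 93902168", "حميد، عبد اللطيف بن محمد"),
]

# Group Romanized headings by their shared script form.
by_script = defaultdict(list)
for heading, number, script in records:
    by_script[script].append((heading, number))

# Any script form with more than one heading is a candidate duplicate.
duplicates = {s: hs for s, hs in by_script.items() if len(hs) > 1}

for script, headings in duplicates.items():
    print(script, "->", [h for h, _ in headings])
```

A match on script form is only a candidate: different people can share a name, so each group would still need review before records are merged.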
Slide17
010 no2012067640 $z no2006125278 $z nb2008017303
100 1 Manīsī, Aḥmad
373 Markaz al-Dirāsāt al-Siyāsīyah wa-al-Istirātījīyah $2 naf
373 Markaz al-Imārāt lil-Dirāsāt wa-al-Buḥūth al-Istirātījīyah $2 naf
375 male
377 ara
400 1 منيسي، أحمد
400 1 Minīsī, Aḥmad
400 1 Munīsī, Aḥmad
400 1 Minnīsī, Aḥmad
400 1 Munaysī, Aḥmad
400 1 Munayyisī, Aḥmad
400 1 Menese, Ahmed
667 Non-Latin script reference not evaluated.
670 Niqābat al-Mihan al-Riyāḍīyah, 2004: $b t.p. (Aḥmad Manīsī = أحمد منيسي)
Slide18
Value of Scripts for Arabic Materials
Precision in discovery
More efficient cataloging practices
More legible and understandable metadata for the patrons
Globalization
Slide19
Adding Scripts: Considerations
Using macros is very helpful, but additional review is needed
Use of different scripts with different directionality in one field may affect the display
Not all scripts are available in the authority file: e.g. Armenian
Display of scripts: problem with certain diacritics
Slide20
UCLA Project: Adding Scripts to Legacy Data
Adding Cyrillic script fields in OCLC to bibliographic records for Russian monographs held by UCLA
Russian: generally a one-to-one transliteration
About 54 thousand records for monographs (excluding mixed-script records)
Using an in-house process, working with a library IT programmer, replacing the master records in OCLC
Fields: 245 ($abcpn), 250, 26X ($abc), 490 $a
Sample records will be reviewed by librarians with language expertise
The processed/replaced records in OCLC will be identifiable by a 588 field: “UCLA Machine-derived non-Latin script bibliographic record project”
Future potential projects: Armenian; smaller number of records and easier to tackle/review
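Because Russian Romanization is close to one-to-one, deriving Cyrillic from Romanized fields can be approximated with a lookup table. The Python sketch below is not the UCLA in-house process, whose details are not given here. It is a hypothetical illustration with a deliberately partial table: letters that ALA-LC writes with ligature marks (such as those for ц, ю, and я) are omitted, and `to_cyrillic` is an invented name. Multi-letter sequences are matched before single letters.

```python
# Partial, illustrative table: ALA-LC-style Romanization -> Cyrillic.
ROMAN_TO_CYRILLIC = {
    "shch": "щ", "zh": "ж", "kh": "х", "ch": "ч", "sh": "ш",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е",
    "z": "з", "i": "и", "k": "к", "l": "л", "m": "м", "n": "н",
    "o": "о", "p": "п", "r": "р", "s": "с", "t": "т", "u": "у",
    "f": "ф", "y": "ы",
    "\u012d": "й",  # ĭ (i with breve)
    "\u0117": "э",  # ė (e with dot above)
    "\u02b9": "ь",  # ʹ (soft sign)
    "\u02ba": "ъ",  # ʺ (hard sign)
}

# Try longer Romanized sequences first so "shch" wins over "sh" + "ch".
KEYS = sorted(ROMAN_TO_CYRILLIC, key=len, reverse=True)


def to_cyrillic(text):
    """Convert Romanized Russian to Cyrillic using the partial table above."""
    out = []
    i = 0
    while i < len(text):
        for key in KEYS:
            chunk = text[i:i + len(key)]
            if chunk.lower() == key:
                cyr = ROMAN_TO_CYRILLIC[key]
                # Preserve capitalization of the first Romanized letter.
                out.append(cyr.capitalize() if chunk[0].isupper() else cyr)
                i += len(key)
                break
        else:
            # Spaces, punctuation, and unmapped characters pass through.
            out.append(text[i])
            i += 1
    return "".join(out)


print(to_cyrillic("Vo\u012dna i mir"))
```

Even in this small table some sequences are ambiguous (for example, "shch" could in principle stand for two separate letters), which is one reason machine-derived script benefits from review by librarians with language expertise.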
Slide21
Sources Consulted
Agenbroad, J. E. (2006). Romanization Is Not Enough. Cataloging & Classification Quarterly, 42:2, 21-34. Retrieved from https://www.tandfonline.com/doi/pdf/10.1300/J104v42n02_03?needAccess=true
ALCTS Non-English Access Working Group on Romanization Report (2009, December 15). http://www.ala.org/alcts/sites/ala.org.alcts/files/content/ianda/nonenglish/apd15a.pdf
Johnson, Chr. B. (December 2012). An Introduction to the ALA-LC Romanization Tables. SCATNews, 38. Retrieved from https://www.ifla.org/files/assets/cataloguing/scatn/scat-news-38.pdf
On Romanization (August 2019). Multilingual Library: A blog devoted to multilingual issues in the library catalog. Retrieved from https://elegantlexicon.com/lib
PCC Guidelines for Creating Bibliographic Records in Multiple Character Sets Report (2017, September). https://www.loc.gov/aba/pcc/bibco/documents/PCCNonLatinGuidelines.pdf
Vernon, E. (1996). Decision-making for automation: Hebrew and Arabic script materials in the automated library. Retrieved from https://www.ideals.illinois.edu/handle/2142/3879
The World's Most Popular Writing Scripts (October 23, 2019). Retrieved from https://www.worldatlas.com/articles/the-world-s-most-popular-writing-scripts.html
Slide22
شُكْراً
Shukran
Thank you