THE CHAIM ROSENBERG SCHOOL OF JEWISH STUDIES RESEARCH SERIES
Studies in the Spoken Language and in Linguistic Variation in Israel
With the Assistance of
TEL-AVIV UNIVERSITY 2002
TABLE OF CONTENTS
* The articles marked with an asterisk are products of the research workshop of The Corpus of Spoken Israeli Hebrew (CoSIH) (Atlanta 2000). These articles have been published in their original, English form in: Benjamin H. Hary (ed.), Corpus Linguistics and Modern Hebrew: Towards the Compilation of The Corpus of Spoken Israeli Hebrew (CoSIH), Tel Aviv: Tel Aviv University, The Chaim Rosenberg School of Jewish Studies, 2003.
FOREWORD by Yair Hoffman
PREFACE by Shlomo Izre'el
Corpus Linguistics and Computational Linguistics
* John Sinclair CORPUS LINGUISTICS: THE STATE OF THE ART
* John Sinclair LEXICAL GRAMMAR: A NEW LOOK AT LANGUAGE
Shuly Wintner HEBREW COMPUTATIONAL LINGUISTICS: PAST AND FUTURE
Language and Society in Israel
* Eliezer Ben-Rafael MULTICULTURALISM AND MULTILINGUALISM IN ISRAEL
Muhammad Amara HEBREW AMONG THE ARABS IN ISRAEL: SOCIOLINGUISTIC ASPECTS
* Otto Jastrow THE CORPUS OF SPOKEN PALESTINIAN ARABIC (COSPA)
* Elana Shohamy and Bernard Spolsky FROM MONOLINGUAL TO MULTILINGUAL? EDUCATIONAL LANGUAGE POLICY IN ISRAEL
* Yaakov Bentolila LINGUISTIC VARIATION ACROSS GENERATIONS IN ISRAEL
Ora (Rodrigue) Schwarzwald LANGUAGE VARIETIES IN CONTEMPORARY HEBREW
Zohar Livnat ON LANGUAGE, LAW, AND SOCIAL JUSTICE
Spoken Hebrew in Israel and Its Study
Moshe Bar-Asher MODERN HEBREW AND ITS CLASSICAL BACKGROUND
* Shlomo Izre'el THE EMERGENCE OF SPOKEN ISRAELI HEBREW
* Shmuel Bolozky PHONOLOGICAL AND MORPHOLOGICAL VARIATION IN SPOKEN HEBREW
* Geoffrey Khan THE STUDY OF MODERN HEBREW SYNTAX
Yael Reshef THE SOCIOLINGUISTIC PHENOMENON OF V FORM IN HEBREW IN THE BRITISH MANDATE PERIOD
Ron Kuzar THE SIMPLE IMPERSONAL CONSTRUCTION IN TEXTS REPRESENTED AS COLLOQUIAL HEBREW
Esther Borochovsky - Bar Aba BETWEEN SPOKEN AND WRITTEN LANGUAGE: EXAMINING PARALLEL SPOKEN AND WRITTEN TEXT
Yitzhak Shlesinger POLARITY IN LANGUAGE LEVELS IN LITERARY TEXTS
Tamar Sovran SPOKEN AND POETIC LANGUAGE IN ISRAELI MODERN POETRY
Il-Il Yatziv FROM TRANSCRIPTION OF SPOKEN TEXT TO ITS REPRESENTATION ON A GRID SET
Toward the compilation of The Corpus of Spoken Israeli Hebrew (CoSIH)
* Giora Rahav POPULATION SAMPLING FOR THE ESTABLISHMENT OF A REPRESENTATIVE CORPUS
* Benjamin Hary and Shlomo Izre'el THE PREPARATORY MODEL OF THE CORPUS OF SPOKEN ISRAELI HEBREW (COSIH)
* Regina E. Werum METHODOLOGICAL REMARKS ON CREATING THE CORPUS OF SPOKEN ISRAELI HEBREW (CoSIH)
Of the 17 volumes of Te'uda that have preceded this one, 8 are inter-disciplinary in nature, reflecting a wide range of research topics in Jewish Studies. The other volumes each focused on a major topic: the Cairo genizah (1; 15); the Bible (2); the Talmud, Biblical commentary, and Jewish Thought (3; 13); Hebrew literature (5); Hebrew and Arabic (6; 9); and Jewish-Arabic cultural encounters in the Middle Ages (14). The eighteenth volume of Te'uda, which we are happy to present to our readers, belongs to this second group, but it also introduces an innovative note: This is the first volume that does not focus on written texts and is entirely devoted to a research topic that is still in its infancy and deals with the present. This represents a statement, both academic and cultural: Even in the field of Jewish Studies it is important and possible to study not only written texts, but also the present and the evolving, which have not yet been immortalized in writing; the Hebrew language that is taking form. What belongs more to this category than spoken language? It is an important topic of research, which goes beyond the narrow realm of linguistic study to a study of Israeli society in its entirety. The living, spoken Hebrew language is one of the most striking achievements, some maintain the most important achievement, of the Israeli experience, because it is at once a formative factor and a mirror of the society in which we live. As such, the Hebrew language is deserving of a research tool, research methods, and scholars who will develop them. I hope that this volume will serve to advance these goals.
The subject of this volume being offered to scholars of the Hebrew language and its devotees is spoken Hebrew. Its nucleus is a meeting of scholars that occurred in February 2000 at Emory University in Atlanta (Georgia, USA). Papers were delivered by team members of The Corpus of Spoken Israeli Hebrew (CoSIH) and other scholars invited for this purpose, and questions were discussed pertaining to the compilation and design of the corpus. At the initiative of Yair Hoffman, editor of the Te'uda series, other scholars engaged in the research of the Hebrew language were asked to contribute from their research to this volume. These articles, to a large extent, supplement the writings of the CoSIH workshop seminar, because they show, from various and diverse aspects, the pressing need for compiling a corpus of spoken Israeli Hebrew.
The study of spoken Hebrew is still in swaddling clothes, perhaps even in gestation. Without a corpus of data, there cannot be any research. Random data retrieval enables observation and insights. Authentic research that stems from research hypotheses and includes an examination of the data, followed either by corroboration or by refutation of the hypotheses (not only theory for its own sake), is possible only in a limited way when the research has no proper data base. Therefore, this volume should be properly valued for what it is: a proposal for directions of research into spoken Hebrew.
The research workshop in Atlanta consisted of several sections, and this publication essentially preserves that organization. The first section deals with the methodology of language research according to corpora. The section opens with two articles by John Sinclair, a professor at the University of Birmingham (England) and the founder and director (together with Elana Tognini-Bonelli) of the Tuscan Word Center in Italy. John Sinclair is one of the founding fathers of the linguistic method that sees the language corpus as a fitting basis for a new view of language. The first article presents the reader with the state of the art in corpus linguistics; in his second article, there is a survey of the achievements of research in corpus linguistics in the innovative field of "lexical grammar", a field of research in which Sinclair's own contribution is definitive. The tremendous momentum accorded to language research supported by language corpora would not have been possible without the major developments in the computer sciences and computerization technology of the past decades. In general, no comprehensive linguistic research today is possible without the use of computers. Shuly Wintner, a computational linguist, was asked to survey the state of the art in this field in the study of Hebrew, and his article concludes the first section.
The second section consists of four articles and presents a profile of the relationship between language and society in Israel. Eliezer Ben-Rafael, President of the International Institute of Sociology and President of the Israeli Association for the Study of Language and Society, reviews the cultural and linguistic variety in our country. Muhammad Amara, a researcher of Arabs in Israel and their languages, turns his scrutiny to the Hebrew spoken by Arabs in Israel and asks about the attitude of Arabs toward Hebrew. Otto Jastrow, an expert in Arabic dialectology, reviews the status of the compilation of The Corpus of Spoken Palestinian Arabic (CoSPA). Elana Shohamy and Bernard Spolsky, whose expertise is in language policy, language education, and issues of language and society, discuss the policy of language education in Israel.
The third section contains three articles that deal with linguistic variation. Language variation may be the result of demographic variance or the result of language use in different discourse contexts. Yaakov Bentolila, a distinguished scholar of spoken Hebrew and Jewish languages, and one of the first to recognize the need to examine spoken Hebrew as a subject of research in its own right, shows that age, a demographic variable, is a dominant variable in language use. Ora Schwarzwald, one of the leaders in the study of Israeli Hebrew, proposes a classification of language varieties according to either demographic or contextual criteria. This section concludes with an article by Zohar Livnat that shows the need to study language varieties and their relationships as she observes the language of law and its users.
The fourth section is devoted to a study of language itself. The first article is by Moshe Bar-Asher, President of the Academy of Hebrew Language and an important scholar of Hebrew and Jewish languages, who shows how Modern Hebrew continues the tradition of past generations in its creative aspect. The creative aspect is demonstrated, among other ways, by the uses of spoken Hebrew, uses that would no longer be recognized as novel and, therefore, constitute evidence that they are essential to the full and complete vitality of our language. In the second article of the section, I propose a new approach to observing the formative processes of spoken Israeli Hebrew in the early twentieth century. Shmuel Bolozky, an eminent scholar of contemporary Hebrew, conducts a comprehensive, detailed survey of phonological and morphological variation in spoken Hebrew, a survey that draws largely upon his own long-term and productive occupation with the subjects of his research. Geoffrey Khan, a specialist in Semitic linguistics and the syntax of Semitic languages, reviews the state of research into Modern Hebrew syntax. The next five articles test observation methods in the relationship between the spoken language and the written language. Yael Reshef examines the formal terms of address (V form) in correspondence during the British Mandate, which she found in the archives of the City of Tel Aviv, and tries to evaluate whether this stemmed from a spoken level of Hebrew of some kind. Yitzhak Shlesinger, who has dealt at length with language varieties in the press, takes a look, in his article, at the polarity between the language used in narratives and dialogues in Hebrew literature, a form of language that tries to represent spoken language. Ron Kuzar investigates the simple impersonal construction in language represented in literature as spoken language.
Tamar Sovran presents the entry of spoken Hebrew into the Hebrew poetry of recent years and examines the sources of this change. Esther Borochovski-Bar Abba evaluates the differences between parallel texts that appear in speaking and writing and draws preliminary conclusions about the unique characteristics of the spoken language. When we are in possession of a large corpus of spoken Hebrew, says Kuzar, it will be possible to examine the validity of ideas proposed here on the basis of a solid methodology, and this is true of any investigation of spoken language using the data available today. The last article in this section is by Il-Il Yatziv, a student of Claire Blanche-Benveniste, an outstanding linguist working on spoken French, who developed a fascinating method for investigating the syntax of a spoken language. Yatziv applied this method to Hebrew and presents its principles here.
The last section presents reviews of the methodology to be employed in the compilation of The Corpus of Spoken Israeli Hebrew (CoSIH). Giora Rahav, who is responsible for the statistical and social aspects of the CoSIH framework, writes about population sampling and establishing a representative corpus of language. Benjamin Hary and I, the initiators of CoSIH, who together headed the planning team, present the corpus design, as presented before the CoSIH team in February 2000. This design is the design of a model before application. A methodological perusal of this design by Regina Werum, from a sociological viewpoint, concluded the CoSIH workshop and concludes this volume.
My warmest thanks to my colleague Benjamin Hary, who must be credited with the success of the Atlanta conference and the launching of the long-term CoSIH project. My thanks to members of the CoSIH team for their participation and support. Thanks to Giora Rahav for his help in editing the articles that deal with social questions. Thanks to Yair Hoffman, Director of the Rosenberg School of Jewish Studies and to members of the school's publication committee, who recognized the project's importance and supported the publication of the proceedings of the workshop in this expanded and admirable format. In conclusion, kudos to Margalit Mendelson, who worked so devotedly with me in editing this volume.
Corpus Linguistics: The State of the Art
© John Sinclair 2002
This paper is a brief indication of topics of current interest among those working with corpora of languages.
The context of the debate is the recognition that the computer processing of language in general has not led to useful applications, despite great optimism. The conclusion is that the theoretical frameworks which were devised long before corpus evidence was available are inadequate for the task, and the big question is whether the large amount of new evidence coming from corpora will sufficiently improve the situation.
The important issues in corpus linguistics at the end of the year 2000 are matters of the overall size of corpora, and the methods by which they are annotated.
Size. Anyone committed to the description or exploitation of a natural language has to confront the problems of "open text", that is, what millions of people write and say in a wide variety of situations. Open text is so multifarious that a corpus which will capture sufficient evidence of most of the meaningful patterns of it will be very large indeed by our present standards, containing probably billions of words. Current research suggests strongly that combinations of several words are often the minimal units of meaning, with corresponding implications for size.
Annotation. It makes good sense to keep a record of any analysis of a text that may be useful in an application. An unfortunate practice, dating from the days when computers were much less powerful and flexible than today, is to intersperse annotations and words in a text, and hold as a corpus a number of such texts. The corpus is irredeemably corrupted by this practice, which is unnecessary today. Analyses can be separately stored and interspersed with the text if and when required in the course of an application.
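The separation described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that annotations are held as character-offset spans beside an untouched text and are merged into it only when an application asks for them. The names `Annotation` and `annotate_inline`, the `word/LABEL` output format, and the sample labels are hypothetical and do not come from the paper; the sketch also assumes that annotated spans do not overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int  # character offset where the annotated span begins
    end: int    # character offset just past the end of the span
    label: str  # hypothetical analysis label, e.g. a part-of-speech tag


def annotate_inline(text, annotations):
    """Return a copy of text with each annotated span rewritten as span/LABEL.

    The stored text is never modified; the merged form is built on demand.
    Assumes the annotated spans do not overlap.
    """
    pieces = []
    cursor = 0
    for ann in sorted(annotations, key=lambda a: a.start):
        pieces.append(text[cursor:ann.start])                    # unannotated gap
        pieces.append(f"{text[ann.start:ann.end]}/{ann.label}")  # annotated span
        cursor = ann.end
    pieces.append(text[cursor:])                                 # trailing text
    return "".join(pieces)


# The corpus text and its analysis are stored separately.
text = "the corpus grows"
annotations = [
    Annotation(0, 3, "DET"),
    Annotation(4, 10, "NOUN"),
    Annotation(11, 16, "VERB"),
]

merged = annotate_inline(text, annotations)
print(merged)  # the/DET corpus/NOUN grows/VERB
```

Because the merged form is generated rather than stored, a competing analysis can be kept as a second list of spans over the same text without altering the corpus itself.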
Another practice that should be avoided, given what is entailed in the size issue, is the manual analysis of corpora. Although there are drawbacks to automatic analysis, there are also important gains, and since the corpora of the future will be too large to handle manually, the effort of analysis should be focused more and more on the automatic process.
In the world of information science outside linguistics, it is necessary to discover something of the meaning of texts in order to arrange for their retrieval by automatic search engines. The prevailing view in this scientific area is that language is not a sufficient self-organising medium, and some other model of information has to be applied. There are many models and hundreds of research groups trying to improve access to language text because most of the internet is presented in written language. So far, linguists have failed to show that language is sufficiently organised to be approached directly. This point links back to the reluctance to explore automatic analysis, and to the issue of size.
The ultimate question is whether a computer could eventually be programmed to comprehend natural language text in a similar way to a user of the language. So far no genuinely intractable feature of language has been adduced, though everyone agrees that it is a formidable task.
Lexical Grammar: A New Look at Language
© John Sinclair 2002
Linguistics traditionally divides the meaning-making patterns of language into two types. One, usually called grammar or syntax, concentrates on the frequent, abstract arrangements of words into phrases, phrases into clauses, etc. The particular quality of individual words is played down and ignored as much as possible. Words are divided into those common ones that regularly indicate grammatical constructions, and the rest, whose meaning is handled in dictionaries, where the grammatical disposition of the words is an occasional and minor matter. This type of meaning making is usually called semantics.
The early findings from the analysis of large language corpora are that this very basic distinction may not be necessary; in fact, it may act as a distorting lens for researchers. Dividing the types of pattern weakens both of them; it becomes impossible in practical terms for a grammar to state comprehensively the classes that operate in structures, and it becomes impossible to describe comprehensively the meanings of words since so many meanings involve word combinations which in turn involve grammatical structures.
It is accepted that the meanings associated with grammar are not confined within it, but can also be created by semantic processes, and by the combination of semantics and grammar. Polarity, modality, tense etc. are examples. Also the description of pragmatic and similar meanings often requires that words in combination are assigned meanings that they do not normally have when examined individually.
Corpora are oriented to the study of words and word combinations rather than abstractions, and so it is natural for lexically oriented descriptions to emerge at the early stages of research with this novel resource. As expected by almost all commentators, the combinations that create meaning appear at first sight to be incompatible with the familiar grammatical categories, and there is clearly a process of reconciliation to be carried out by scholars with some generosity of outlook. At present, grammarians are extending the scope of grammar on an ad hoc basis to include some sensitivity to lexis, while lexicologists are becoming more aware of the importance of syntactic organisation in multi-word lexical units.
The key factor for future developments is the discovery that new meanings and new shades of meaning are revealed as the putative units of meaning incorporate more words, that is, as they are seen maximally rather than minimally. This shift of emphasis reduces the role of grammar because it restricts choice at places in structure, and it remains to be seen where it will stop.
Hebrew Computational Linguistics: Past and Future
Computational linguistics is a research area that lies at the intersection of linguistics and computer science. Computational linguistics can be viewed in two ways: On the one hand, it is the application to linguistics of various techniques of and results from computer science, for the purpose of investigating such fundamental problems as what people know when they know a natural language, what they do when they use this knowledge, and how they acquire this knowledge in the first place. On the other hand, computational linguistics is the application of various techniques of and results from linguistics to computer science, to provide such novel products as computers that can understand everyday human speech; translate among different human languages; and otherwise interact linguistically with people in ways that suit people, rather than computers. This latter view is usually known as Natural Language Processing.
This paper focuses on natural language processing, that is, on computational applications that necessitate linguistic knowledge or emulate language capabilities. Examples of such applications include machine translation from one natural language to another; conversion of speech to text and text to speech; natural language interfaces for computational systems; automatic summarization of documents; spelling and style checking; and so forth. Advances in this field are extremely important: Natural language interaction with computational systems will make such systems more accessible, and will enable more users, including users who are not computer literate, to benefit from the advantages of the systems. Understanding and generation of speech will result in systems for automatic voice interaction and will enable hands-free (or remote-controlled) operation of machines and systems. Speech-to-text conversion will enable hearing impaired persons to "hear" telephone conversations; text-to-speech conversion will let blind people "read" their e-mail. Automatic translation will result in huge savings, when a user's manual of some product is written in one language only and then translated automatically to the languages of the countries in which it is sold. The field has unlimited possibilities, but the current state of the art is such that only a few of these possibilities have been actualized, and then often in a far from adequate manner.
We concentrate, in this paper, on natural language processing for the Hebrew language. We show that Hebrew poses additional problems for developers of programs for language processing, mainly due to its morphology and deficient script. We briefly review the field of linguistics and the problems with which it deals, emphasizing the problems involved in processing the Hebrew language. We then survey systems developed for Hebrew; to the best of our knowledge, this survey covers all works published to date. Finally, we attempt to identify needs and suggest directions for future progress.
Multiculturalism and Multilingualism in Israel
Numerous cases of cleavages divide Israeli society and have an
on language, from the Ethiopian Jews to the Circassians. The
discussed here confirm that multiculturalization is transforming
society. It profoundly alters the nature of the project of
by recognizing a variety of interacting and intermingling
The dominant culture appears to lose its impact on the setting, at the
pace of the strengthening of sociocultural groups. The question that
arises, of course, concerns the reality created by the void left by the
slackening of the dominant orientation and the increasing fragmentation
of the setting. It appears, in this respect, that beyond each cleavage
confronting the dominant culture by asserting its contrastive
each sociocultural group is, at the same time, also significantly
in a variety of manners, to that same dominant culture. The analysis
here the importance of the acquisition of Hebrew by all groups,
Hebrew is given different kinds of coloration and must also share its
of activity with different partners. Hebrew dwells with, and is
by, Yiddish and Biblical Hebrew among the ultra-orthodox, Judeo-Arabic
among the Jews who emigrated from Arab countries, Russian among the
Jews, and Arabic among the Arabs. In addition, there is the impact of
among the privileged class. In each group, Hebrew is also granted a
meaning: It is the language of a new nation among the upper strata, a
vernacular for the ultra-orthodox, the language of traditional Judaism
for Israeli Jews from Arab countries, the language of a target society
for the Russian Jews, and a second language for the Arabs. Yet, it
that the generalized use of Hebrew still means a reference to a common
set of symbols and, thus, the possibility of significant communication.
What now keeps sociocultural groups together, as constituents of one
is that each group has retained its distinctiveness by selecting,
and forging its symbols not only in reference to itself, but also
confrontation with the dominant culture. These groups create
and inter-languages, conveying the imprint of both the dominant culture
and the original cultures. Even in this era of multiculturalism, the
culture is still a major ingredient of the cultural changes undergone
groups at the pace of their social insertion, when efforts to retain an
allegiance to legacies are concurrent with the efforts to acquire and
to new codes and symbols. This is the kind of glue that helps keep
those who share a setting but who, with their own hands, attach
to that setting's fragmentation.
Hebrew Among the Arabs in Israel: Sociolinguistic Aspects
This article investigates the main sociolinguistic aspects of the Hebrew language among the Arabs in Israel. The issues examined are: Hebrew knowledge and use, integration and diffusion of lexical items into Palestinian Arabic in Israel, Hebrew in the language landscape of the Arab villages and cities, and attitudes toward Hebrew.
The study shows that Hebrew is the main foreign source of linguistic innovation. If we compare the slow acculturation of Arab society to the culture, as relates to English, with the fast and dynamic acculturation to the Israeli Jewish culture, we obtain insights into the processes of modernization: Hebrew is a major source of modernization for the Arab society in Israel.
The prestige of Hebrew is related to the progress of Israel in many domains. Many Arabs perceive Israel as a modern country with an advanced technology. To join this progress, many young Arabs learn Israeli patterns of behavior. Despite this, Arabs attach different values to the two languages, Arabic and Hebrew. Arabs are aware that Arabic is a beautiful, rich, and prestigious language; Israeli Arabs are also aware that the mastery of Hebrew is a means for achieving economic, educational, and social levels similar to those existing among Jews. This implies that Arabs learn Hebrew for practical and integrative reasons. This situation, in fact, reflects the nature of the relationship between the Palestinians and Jews in Israel from various perspectives. First, Israel is defined and perceived by Palestinians as a Jewish-Zionist state and not a country for all its citizens. Consequently, the Palestinians are seeking to enhance their national identity in the Jewish State; Arabic serves as an important component in this national identity. Second, the Arab-Israeli conflict has not contributed to a softening of the differences between the Arab minority and Jewish majority in Israel; in some cases, the conflict has strengthened the differences. Third, because Arabs and Jews live in separate locations within Israel, there has not been extensive contact between the two groups; this segregation has even helped preserve some distance between the two peoples. All these factors have served to deny a social convergence at a higher level toward the dominant Jewish culture and its language (Hebrew) among the Arabs in Israel.
The dual identity (Palestinian and Israeli) is reflected in the
repertoire of the Palestinians in Israel. The tension between the two
has limited the degree of convergence to Hebrew, the language of the
culture. This means that the Arabs adopt the strategy of linguistic
(not assimilation). On the one hand, Israeli Arabs attempt, through
a high linguistic competence in Hebrew, to join the wide social network
shaped by the culture of the majority; on the other hand, they preserve
their identity through their mother tongue, Arabic.
The Corpus of Spoken Palestinian Arabic (CoSPA)
The Corpus of Spoken Palestinian Arabic (CoSPA) is an ongoing project that is proceeding according to regions and that will eventually cover the whole linguistic area of Palestinian Arabic. The scope of CoSPA encompasses both Arabic dialectology and corpus linguistics.
First stage: Fieldwork in the Galilee (1996-98)
CoSPA grew out of a joint project of the University of Haifa/Israel (Rafi Talmon) and the University of Heidelberg/Germany (Otto Jastrow, Peter Behnstedt),(1) entitled "A Systematic Survey of the Arabic Dialects in Northern Israel." The project, which was funded by GIF,(2) was undertaken during the years 1996-1998. The project covered more than 120 Arabic-speaking localities in the Lower and Upper Galilee, and we obtained first-hand linguistic data by means of questionnaires and tape recordings. The total number of informants consulted (either interviewed or recorded or both) exceeds 700. The recordings comprise more than 400 cassettes, with a total of 200 hours of recorded speech.
In addition, data were collected for approximately 30 pre-1948 localities (villages no longer existing today) for which informants, who now reside in other villages in the area, could be found. Similarly, a special investigation was aimed at retrieving the original Arabic dialects spoken by Jewish communities prior to the establishment of the State of Israel. This was accomplished for Teverya (Tiberias) and Zefad (Safed), where older members of the local Jewish community were recorded and interviewed. The city of Haifa posed an even greater challenge, because prior to 1948, the Jewish, Christian, and Muslim communities of the city spoke specific Arabic dialects. For his doctoral dissertation, Aharon Geva-Kleinberger was able to find and record a number of elderly speakers of Arabic (who had been living in the city before 1948) from all three communities.
Second stage: Fieldwork in the Muthallath area (1999)
In a second stage, fieldwork was extended to cover the Arabic dialects of the so-called "Triangle" (in Arabic Muthallath, in Hebrew Meshulash). This is an area that is situated south of the Galilee and that comprises approximately 25 localities, from Imm ilFahim in the north to Kufir Kasim in the south.
In 1999, the present writer completed two fieldwork campaigns, for a total of 11 weeks, in the Muthallath area. For the fieldwork in the Muthallath area, we developed a different type of questionnaire, which concentrates on the morphology of the verb. This is because the phonology offers less variation in the Muthallath area than in the Galilee, and the verb morphology is very complicated and contains some interesting points of divergence.
In this second stage, again, coverage was near complete. In each locality visited, I completed the questionnaire (an operation usually requiring the larger part of a day), and in about every third locality, I also made tape recordings of spontaneous speech.
Third stage: Fieldwork in the Central Israel area
The success of stages 1 and 2, as described above, encouraged us to envisage a continuation of our work, with the aim of covering all dialects of Palestinian Arabic spoken within the State of Israel and, ultimately, in the whole linguistic area of Palestine. As the next step, another joint German-Israeli project started in 2001, again funded by GIF. The project is planned for 3 years. The German participants are Otto Jastrow (University of Erlangen) and Werner Arnold (University of Heidelberg), the Israeli participants are Rafi Talmon and Aharon Geva-Kleinberger (University of Haifa) and Arye Levin and Simon Hopkins (The Hebrew University, Jerusalem). The area of investigation will include the Tel Aviv area (Yafo, Lod, and Ramle) and the Jerusalem area, including some nearby localities that are, at present, under Palestinian administration.
The future stages will concentrate on the areas of Palestinian Arabic outside the State of Israel, that is, the so-called "West Bank", which is currently under the administration of the Palestinian Authority. To proceed with fieldwork on a large scale, some official agreement with the local administration will be necessary. On the individual level, however, fieldwork is already progressing.
Characteristics of CoSPA
As described above, the scope of CoSPA encompasses both Arabic dialectology and corpus linguistics. As a multi-purpose corpus,
- It serves the needs of dialectology, providing the basis for grammatical description, dialect geography (including maps), and a lexicon of the respective areas.
- It equally provides the basis for ethnographic and folkloristic studies and allows the publication of collections of oral literature.
- It can provide the basis for computer-based research along the lines of "corpus linguistics".
CoSIH and CoSPA: Points of convergence
There is a considerable overlap between CoSIH and CoSPA in terms of territory and population, because all speakers of Palestinian Arabic who are citizens of Israel are also members of the Hebrew-speaking population. There is continuous Hebrew interference in Palestinian Arabic speech and, on a smaller scale, Palestinian Arabic influence in spoken Hebrew. As CoSIH and CoSPA proceed, it will be possible, for the first time, to study the interrelation of the two languages on a sound empirical basis.
(1) Neither Jastrow, who has since become a professor at the University of Erlangen-Nürnberg, nor Behnstedt is affiliated with the University of Heidelberg. Werner Arnold, who is going to participate in Stage 3 of CoSPA, is affiliated with the University of Heidelberg.
(2) GIF (short for "German-Israeli Foundation") is an agency that promotes research undertaken jointly by German and Israeli scholars.
From Monolingual to Multilingual? Educational Language Policy in Israel
Elana Shohamy and Bernard Spolsky
Although Israel is perceived as a monolingual country, the reality is that it is multilingual: the different groups residing in the country use a large number of languages. The paper describes the process and cost of the revival of spoken Hebrew that began at the end of the nineteenth century. This process continued in Palestine, where Hebrew was reinforced by a strong Zionist ideology that associated language with national unity and discouraged immigrants from maintaining home languages, even Jewish languages such as Yiddish and Ladino. The cost of the revival of Hebrew was the loss of the rich linguistic repertoire of the Jews, a traditional trilingual pattern typical of the years in the Diaspora: Hebrew and Aramaic for sacred and secular literacy; a Jewish language such as Yiddish or Ladino for oral functions within the community; and a territorial gentile language for external contacts. The monolingual Hebrew ideology was bolstered by myths and assumptions that encouraged a new language policy, based on the following assumptions: (a) Hebrew will be learned by immigrants only if all home languages are abandoned; (b) acquiring Hebrew will lead to national unity; and (c) knowledge of Hebrew is key to acculturation and integration, which will be slowed if immigrant languages and memories of the Diaspora continue to exist. The result has been a monolingual society with strong dominance of Hebrew, loss of Jewish and other heritage languages, and a general decrease of national language capacity.
In practice, monolingualism has been more of an ideology than a
reality, as a highly complex pattern of multilingualism has been
by groups living in Israel before independence, by the continuous waves
of immigration, and by the large number of foreign workers. Although
4,500,000 Israelis are estimated to have functional competence in Hebrew
(there has been no language question on the census since 1983), other languages
are represented: about 1,400,000 speakers of Arabic, 800,000 of
200,000 of French, 215,000 of Yiddish, 250,000 of Rumanian, 100,000 of
Spanish, 60,000 of Hungarian, 60,000 of Persian, 60,000 of Amharic, and
50,000 of other languages of the former Soviet Union and other
used by the approximately 200,000 foreign workers (see Spolsky and Shohamy,
The Languages of Israel: Policy, Ideology and Practice, 1999, for a
discussion of the topic). These languages create pockets of resistance
to the monolingual ideology. Thus, Hebrew is now being challenged by
Arabic, Yiddish, and the languages of the foreign workers, and because
Hebrew is forced to compete with English in an increasing number of
As well as widespread plurilingualism, the beginning of respect for
is to be seen in signs of multilingual ideology. This was behind the
language in education policy for Israeli schools, adopted by the Ministry
of Education in 1996. This was the first official document stating that
each group, Jews and Arabs, should learn three languages and should be
encouraged to learn an additional home, community, or world language as
well. The multilingual policy is based on the assumption that Israel
of a large number of ethnolinguistic groups, that the languages of
and indigenous groups are assets and not liabilities, that language
should be additive rather than subtractive, that plurilingualism is
of multiculturalism, and that different languages are needed for
purposes. For Jews, the languages are Hebrew, Arabic, and English, plus
a heritage/community/world language; for Arabs, it is Arabic, Hebrew,
and additional languages. The policy encourages immigrants to maintain
home languages. The publication of the new language policy is
as it does not view other languages as a threat to Hebrew. However,
remain serious questions about implementation of the policy, as result
of complex pressures from a variety of directions. One is the lack of
to learn additional languages, especially in case of learning Arabic,
the political conflict. Another is the reluctance to teach spoken
and there are some who fear that English in Israel poses a threat to
In addition, it is not clear that community efforts to maintain Russian
will succeed. In sum, the new policy offers a guide on how to develop a
successfully multilingual society, but there are no guarantees that the
policy will be implemented by governments that have more urgent
Linguistic Variation Across Generations in Israel
Israeli Hebrew comprises roughly two broad linguistic variants: "General" (GH) and "Mizrahi" (MH). The former enjoys social prestige and is the variety most widely used by the elite. It is, therefore, commonly spoken by a constantly growing part of the population. The "Mizrahi" variety reflects Hebrew as it was pronounced in Arabic-speaking countries. Immigrants from those countries form the lower ranks of Israeli society. Mizrahi Hebrew is socially marked. Linguistically, it is characterized, inter alia, by the full phonetic realization of the pharyngeal fricatives: the sociolinguistic variables (H) (HET) and (ʿ) (ʿAYIN).
The emancipation of Mizrahi Jews in Israel is an ongoing social trend with cultural and political aspects. Culture-oriented Mizrahi leaders and intellectuals might be motivated by the will to gain recognition or esteem for the values of their ethnic identity, and those include the MH pronunciation, especially HET and ʿAYIN. On the other hand, young Mizrahi activists may find themselves in a conflicting situation. Their efforts to integrate with the leading elements in the society compel them to use the prestigious GH norm. The situation becomes a knotty one when ethno-cultural arguments are used for political purposes; thus, the political context, when devoid of ethnic concerns, promotes the use of GH, while the cultural or the ethnic contexts, either with or without a political bias, tend towards MH. Also, on some occasions, Mizrahi persons who otherwise speak in the MH variety may intentionally use the GH pronunciation in interactions with persons of their own community; in exhibiting linguistic manners of the influential elements of Israeli society, they seek social status.
From what is commonly known about MH and its users, the phoneme /H/ is considered a sociolinguistic indicator, that is, a variant that shows social or ethnic distribution. It might be interesting to regard it as a sociolinguistic marker, that is, one displaying stylistic distribution. I found that, in Romema, (H) was still largely an indicator among the elderly (males or females), among the majority of the younger males, and among females with a poor educational background. However, it had begun to function as a marker among more educated young females and among young males involved in political activity.
I think that a sociolinguistic indicator becomes a marker via gradual change and is characterized by a growing awareness of its social values by part of the population, for example, younger females and politically active young males. That gradation from no variation at all (a status of "indicator") to an evolving consciousness of the social value of a variant (a status of "marker") promotes the spread of variation and is more readily observable in small communities that are characterized by a close-knit social network.
As far as Romema is concerned, sex, age, socio-political
and outward connections are most significant in the linguistic choices
made by the speakers. Although systematic research has not yet been
in Israel on that specific subject, one cannot help noticing that the
behavior of children is often very dissimilar to their parents'
behavior, mostly as the result of different social routes at adult ages
and of ties with peer groups in the adolescent phase of life. In
a corpus, I believe we should insist on gathering data not only from
but also from parents and close kin, so that the relevance of family
to linguistic variation and change can be examined.
Language Varieties in Contemporary Hebrew
Ora (Rodrigue) Schwarzwald
The paper is both an overall survey of the linguistic varieties found in Modern Hebrew in Israel and an updated review of the research on each linguistic variety. The classification of the linguistic varieties is based primarily on text typology, as described by external and internal criteria by Sinclair, Ball and their team (EAGLES96), in which text refers to any linguistic performance, oral or written. The external criteria for text typology include (a) origin, (b) state, and (c) aim of a given text. The internal criteria for text typology include (a) topic and (b) style. The external criteria determine the internal type of text.
The most significant factor in determining language varieties in Modern Hebrew is the text origin, namely, the speakers' features, their locations, times of speech production, and the roles of the participants. The speakers' features include the following types: (1) ethnic origin, (2) socioeconomic status, (3) gender, (4) religiosity, and (5) age. The state of the text refers to whether it is oral or written. Oral texts vary according to the speech pace and cautiousness.
Written texts range from formal to informal, literary to non-literary writings, printed and non-printed, and so on. The aim of the text refers both to the target audience of the text and to the producer's intentions in producing the text.
The internal text criteria are interwoven among the external
Therefore, the most significant varieties are content oriented,
on the subject matter, ideological attitudes, and social desirability.
The style varies between formal and informal types.
The research survey reveals that the ethnic varieties of speech were studied in the 1950s, and most of the research done in other sociolinguistic areas of Modern Hebrew started in the 1970s. Research on varieties related to gender and religion is the most recent. Based on pure linguistic observations, the paper also raises many issues of language varieties that ought to be further studied scientifically.
On Language, Law, and Social Justice
The various areas in which language and the law overlap have, in recent years, attracted the attention of researchers from different disciplines. The common areas in which the linguist can contribute actively and substantially to the justice system are gradually becoming clearer. Linguistic tools may serve as aids in the legal decision-making process in the courts or in judicial review. This paper will shed light on those points that may be especially important and productive in this context.
Legal register, social justice, and the demand for the use of Plain Language
The linguistic debate surrounding the legal-judicial register need not be restricted to a mere description. This area of discussion can be expanded to include the unique difficulties that the legal register presents to the public and can even suggest solutions to this problem.
The unique properties of the legal register make legal texts difficult for the uninitiated to understand. On the assumption that the rights of citizens (as well as those of residents or tourists) are at least partially dependent on their ability to comprehend the texts in which these rights are described, and on the assumption that involvement in a legal process of any kind is always contingent on contact with spoken and/or written texts, it is understood that our interest here is with issues related to inferiority or inequality before the law.
Non-native speakers are likely to require interpretation during any contact they may have with the legal system. However, the courts tend to keep the use of professional interpreters to a minimum, calling upon them only when left with no other alternative when conducting a trial. An interrogation conducted in a language in which the defendant is not fluent has far-reaching implications for the manner in which the testimony is received. Despite this, Israeli law does not go into sufficient detail on the question of when a defendant "does not know Hebrew" and who is qualified to interpret for the defendant in such a case.
The legal register may be difficult for native speakers to understand as well. Recognition of this triggered the demand to simplify legal language a number of decades ago. This recognition also led to the establishment of movements promoting the use of Plain Language in legal texts in many places in the world. In Israel, such voices are not being heard. In my view, the possibility should not be ruled out that, knowingly or unknowingly, the legal community has perpetuated its public and economic standing by means of a linguistic style that is partially or wholly incomprehensible to large sections of the public.
"Legal linguistics": Linguists as expert witnesses in court
In many Western countries, there has been a tendency, in recent years, to summon linguists to testify as expert witnesses on linguistic matters relevant to the court debate. Most cases involve two types of testimony: those related to the ability of a particular addresser to produce a certain text (e.g., in matters related to the determination of the authenticity of a confession), and those related to the ability of a particular addressee to understand a certain spoken or written text.
The opinion provided by an expert linguist may also aid in legal interpretation. To illustrate the possible contribution of linguistic expertise in this matter, this paper discusses a case brought before an Israeli court, in which a considerable portion of the debate focused on the meaning of the word "violence" and on the question of whether "verbal violence" is indeed a form of violence. A linguistic analysis shows that the legal debate, described by the judge as "language-linguistic", does not always conform to the analysis as seen through linguistic eyes and may even lead to an opposite conclusion.
Linguists can place themselves at the disposal of the judicial system as experts, show the places where linguistic expertise is necessary, and propose linguistic tools from various fields, such as second-language acquisition or reading comprehension, to aid in situations related to the judicial mechanism.
Rhetoric and the legal discourse: Form versus content
The debate on rhetoric in the legal context can be a critical one, for example, in the analysis of testimonies and the designation of the situations in which justice may not necessarily have been served. Testimonies given in court may often be alternative versions of the "truth", and the judge must decide "whom to believe". This choice may be affected not only by the content of the testimonies, but also by their form or even phrasing. False testimony presented articulately, fluently, and confidently may make a better impression than a truthful one presented haltingly, irresolutely, and timorously.
Modern Hebrew and Its Classical Background
This article investigates the history of modern written and spoken Hebrew. We may consider the middle of the eighteenth century, when Moses Mendelssohn published the journal Kohelet Musar (c. 1755), the starting point of modern written Hebrew. The arrival of Ben-Yehudah in Eretz Israel in 1881 may be taken as the starting point of spoken modern Hebrew. Modern Hebrew maintains the historical continuity of the Hebrew language. Most of the lexical and grammatical elements incorporated into modern Hebrew have been drawn from the Hebrew of ancient periods.
Nevertheless, it is clear that many syntactic neologisms have been derived from earlier strata. In other words, modern Hebrew makes diverse and innovative use of the elements it adopts from classical Hebrew, which I demonstrate through two grammatical patterns.
(1) The pʿalʿal pattern
This pattern is found in classical Hebrew (Biblical Hebrew and Mishnaic Hebrew), for example: yraqraq, ʾadamdam. Even in ancient times it was not clear whether ʾadamdam meant red, very red, or slightly red. Indeed, sources from the tannaitic period provide evidence of all three opinions. In modern Hebrew, it was decided to adopt the pʿalʿal pattern to indicate the diminutive. Hence, klablab denotes a small dog, and qtantan means very small.
(2) The paʿil pattern
In classical Hebrew, we find paʿil, which functions as a verbal noun or indicates a season of the year, for instance, Harish (the activity of the plower or the plowing season). We also find paʿil, which is the passive participle of the paʿal pattern, that is to say, a variant of paʿul: qariʾ like shaliaH, qaruʾ like shaluaH are passive participle forms. However, in modern Hebrew, the forms ʾakhil, qariʾ, and so forth serve as adjectives of potentiality. Therefore, qariʾ means readable, and ʾakhil means edible. This innovation was apparently inspired by the modern European languages, which create this kind of adjective. In summary, we can determine that written and spoken modern Hebrew often adopts classical elements from the rich tradition of classical (Biblical and Mishnaic) Hebrew and makes new use of them. What I present here is but a brief demonstration of this.
The Emergence of Spoken Israeli Hebrew
Formerly used mainly as a literary and liturgical language, Hebrew was transformed at the turn of the twentieth century into a full-fledged, vernacular language and the national language of the Jews in Israel. The term "revival" for the emergence of Hebrew in Palestine cannot be justified. Hebrew never died, and the language served for almost two millennia, not only as a written medium, but also for oral communication between Jews wherever necessary.
The written language was never frozen or rigid. It was constantly changing, and influences from internal Hebrew strata (especially from local vernaculars, that is, languages spoken by Jews wherever Hebrew was written) have always made their impact on Hebrew. Regressions to more "original" or "pure" Hebrew, as was the case during the time of the Enlightenment, are rare and must be regarded as partial.
The course taken by written Hebrew toward its use in Palestine and Israel as the main medium of written communication must be regarded as sequential and gradual to various degrees. This is obviously not the case with the emergence of spoken Hebrew. When massive waves of immigration from many countries began to reach Palestine, a multilingual society with a pressing need for immediate communication began forming. It was at this point that the transformation of Hebrew took place most rapidly.
The result of both processes, the sequential development of written Hebrew and the rapid regeneration of spoken Hebrew, is the contemporary linguistic continuum of the Hebrew language, as it is used in Israel. From the linguistic point of view, Modern Hebrew, as compared with earlier stages of Hebrew, has a significantly different structure.
Israeli Hebrew was shaped mainly under the influence of European strata at the beginning of the twentieth century, so its emergence may be compared to the emergence of creole languages. This paper describes similarities and dissimilarities between the emergence of Hebrew and the creolization process and tries to show some advantages in setting the recent history of Hebrew under this alternative perspective.
To understand the active processes within the creation of Modern
Hebrew, research should take the following directions:
1. Search for evidence of spoken Hebrew in the early years of its emergence,
2. Search for evidence of prior creolization in data from contemporary Israeli Hebrew, and
3. Reevaluate the sociolinguistic and, especially, previous linguistic research on the emergence of twentieth-century Hebrew.
All aspects raised in this paper or implied by its subject matter must and can be investigated: the extent to which contemporary Hebrew is different from all previous layers of the language; the gap between spoken and written varieties of Hebrew; the interrelationship among the continua of varieties within each and the impact of any of the existing varieties on each other; the history of Modern Hebrew on both its written and spoken continua; how Hebrew fits into the larger continuum of contact-induced languages; and how the emergence of Hebrew is related to creolization. All of these issues and many other related and unrelated questions must and can be investigated. However, investigations such as these can occur only where data exist. Without the compilation of an Israeli Hebrew corpus, none of these issues will emerge from obscurity. In fact, none of these topics can be investigated at all, unless we have at our disposal a corpus of Israeli Hebrew, both written and spoken.
An earlier, English version of this paper can be found at http://www.tau.ac.il/humanities/semitic/emergence.html.
Phonological and Morphological Variation in Spoken Hebrew
The article starts with a discussion of some of the processes responsible for phonetic variation in Israeli