Gila Prebor

  • Gila Prebor is a lecturer in the Department of Information Science, Bar-Ilan University, Ramat Gan, Israel. Her main ...
ABSTRACT This poster explores the potential of using technological tools, specifically the Transkribus platform, for the transcription of Hebrew manuscripts. The digitization of historical resources has made them accessible, but the textual content of the scanned images remains inaccessible. Transkribus, an AI‐powered platform, offers tools for text recognition, transcription, and search of historical documents. The poster discusses the process of automatic text recognition (ATR) and the challenges it faces, particularly in handling handwritten texts and Hebrew letters. It provides an overview of the Transkribus platform, its functionalities, and the training process for creating transcription models. The author presents a case study of transcribing a 15th‐century Sephardic semi‐cursive Hebrew manuscript using the Transkribus platform and evaluates the performance of different models. The poster concludes by discussing the implications and possibilities of using Transkribus for autom...
This dataset comes from Footprints: Jewish Books Through Time and Place. Data includes provenance data (names, dates, locations) related to editions (title, publication date, publication place, standard identifier) of specific works (title, author) and related notes. The hierarchy for the dataset is composed of "footprints" which are found in "copies" of "imprints" of "literary works," with "footprint" being the most specific data point and "literary work" being the most general. Wherever possible, the standard identifier for imprints is the Bibliography of the Hebrew Book. Downloaded on October 1, 2021.
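The four-level hierarchy described above can be sketched as nested records. This is only an illustration, not the Footprints project's actual schema: the class names, field names, and placeholder values below are assumptions chosen to mirror the terms used in the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Footprint:
    """Most specific data point: one provenance event recorded for a copy."""
    name: str                       # person or institution (hypothetical field)
    location: Optional[str] = None  # where the event took place
    date: Optional[str] = None      # when the event took place


@dataclass
class Copy:
    """A single physical copy of an imprint."""
    footprints: List[Footprint] = field(default_factory=list)


@dataclass
class Imprint:
    """An edition of a literary work."""
    title: str
    publication_place: Optional[str] = None
    publication_date: Optional[str] = None
    standard_identifier: Optional[str] = None  # e.g. a bibliography record ID
    copies: List[Copy] = field(default_factory=list)


@dataclass
class LiteraryWork:
    """Most general level: the work itself."""
    title: str
    author: Optional[str] = None
    imprints: List[Imprint] = field(default_factory=list)


def count_footprints(work: LiteraryWork) -> int:
    """Count every footprint found in every copy of every imprint of a work."""
    return sum(len(c.footprints) for i in work.imprints for c in i.copies)


# Placeholder data only; none of these values come from the dataset.
work = LiteraryWork(
    title="Example Work",
    imprints=[
        Imprint(
            title="Example Edition",
            copies=[Copy(footprints=[Footprint("Owner A"), Footprint("Owner B")])],
        )
    ],
)
print(count_footprints(work))  # prints 2
```

Nesting the records this way keeps each footprint attached to exactly one copy, which matches the "footprints in copies of imprints of literary works" ordering in the description.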
This dataset comes from Footprints: Jewish Books Through Time and Place. Data includes provenance data (names, dates, locations) related to editions (title, publication date, publication place, standard identifier) of specific works (title, author) and related notes. The hierarchy for the dataset is composed of "footprints" which are found in "copies" of "imprints" of "literary works," with "footprint" being the most specific data point and "literary work" being the most general. Wherever possible, the standard identifier for imprints is the Bibliography of the Hebrew Book. Downloaded on June 1, 2021.
This dataset comes from Footprints: Jewish Books Through Time and Place. Data includes provenance data (names, dates, locations) related to editions (title, publication date, publication place, standard identifier) of specific works (title, author) and related notes. The hierarchy for the dataset is composed of "footprints" which are found in "copies" of "imprints" of "literary works," with "footprint" being the most specific data point and "literary work" being the most general. Wherever possible, the standard identifier for imprints is the Bibliography of the Hebrew Book. Downloaded on April 1, 2021.
This dataset comes from Footprints: Jewish Books Through Time and Place. Data includes provenance data (names, dates, locations) related to editions (title, publication date, publication place, standard identifier) of specific works (title, author) and related notes. The hierarchy for the dataset is composed of "footprints" which are found in "copies" of "imprints" of "literary works," with "footprint" being the most specific data point and "literary work" being the most general. Wherever possible, the standard identifier for imprints is the Bibliography of the Hebrew Book. Downloaded on May 1, 2021.
This dataset comes from Footprints: Jewish Books Through Time and Place. Data includes provenance data (names, dates, locations) related to editions (title, publication date, publication place, standard identifier) of specific works (title, author) and related notes. The hierarchy for the dataset is composed of "footprints" which are found in "copies" of "imprints" of "literary works," with "footprint" being the most specific data point and "literary work" being the most general. Wherever possible, the standard identifier for imprints is the Bibliography of the Hebrew Book. Downloaded on July 27, 2018.
Abstract The article describes a collation tool named ‘Juxta Commons.’ This open-source tool for comparing and collating multiple witnesses of a single textual work handles Hebrew texts successfully. To demonstrate the potential use of Juxta Commons, two example comparisons of Hebrew texts were selected.
... Esther Lapon-Kandelshein • Gila Prebor Received: 17 February 2011 © Akadémiai Kiadó, Budapest, Hungary 2011 ... G. Prebor (✉) Department of Information Science, Bar-Ilan University, Ramat-Gan 52900, Israel e-mail: Gila.Prebor@biu.ac.il ...
This article investigates the forms of classification and indexing found in yeshiva libraries in the State of Israel. The yeshiva (plural: yeshivot) is a Jewish educational institution that focuses on the study of traditional religious texts, primarily the Talmud and the Bible. The research goal was to analyze classification and indexing systems in these libraries, examine how they evolve, and compare the yeshiva classification systems used in practice with the treatment of Jewish studies in other classification systems. This study can help us understand how classification systems develop and which cognitive, philosophical, and administrative processes lie behind them.
The relationship of F.M. Dostoevsky with Jews has attracted the attention of numerous scholars over the years, many of whom attempted to grapple with the views of the great writer and their origin. In this article we attempt to show this relationship by analyzing six of Dostoevsky’s greatest novels, written throughout his career. We analyze these novels using Distant Reading in conjunction with Close Reading, tools that are commonly used in the field of digital humanities, which enabled us to show visually the extent of F.M. Dostoevsky’s engagement with this topic. The study poses two research questions: 1. To what extent did the writer use the more denigrating term “Zhid”? 2. Can we see a correlation between the writer’s portrayal of Jews and the definition of anti-Semitism as it was known during his era? The obtained results show that there is clearly a correlation between the definition of anti-Semitism as it was understood at the time of Dostoevsky and ...
The goal of this research is to develop a generic ontological model for proverbs that unifies potential classification criteria and various characteristics of proverbs to enable their effective retrieval and large‐scale analysis. Because proverbs can be described and indexed by multiple characteristics and criteria, we built a multidimensional ontology suitable for proverb classification. To evaluate the effectiveness of the constructed ontology for improving search and retrieval of proverbs, a large‐scale user experiment was arranged with 70 users who were asked to search a proverb repository using ontology‐based and free‐text search interfaces. The comparative analysis of the results shows that the use of this ontology helped to substantially improve the search recall, precision, user satisfaction, and efficiency and to minimize user effort during the search process. A practical contribution of this work is an automated web‐based proverb search and retrieval system which incorpora...
In this study, we present the first large-scale quantitative analysis of a corpus of censored historical Hebrew manuscripts that have survived through the ages. A new multi-dimensional ontology-based approach was applied to explore the geographic, temporal, actor- and subject-based distribution of censorship events. We adopted an ontology-based approach to apply statistical analysis on the metadata of censored Hebrew manuscripts for estimating the scope and quantifying the extent of the known facts on the censorship activity and its various characteristics over the years. In addition, we revealed some previously unknown phenomena and trends. Particularly, we analysed the relationship between censorship and other types of events in manuscripts’ lifecycle and compared the distribution of censored vs. non-censored manuscripts in different dimensions. We also devised a set of rules to complete the missing locations of over 50% of censorship events, which has substantially changed the big picture...
Purpose: The purpose of this study is to examine how different feminist Facebook groups in Israel operate in order to better understand the main issues in their discussions about feminism in Israel. The study will also identify the variances between the different subgroups. A secondary research question examined was whether Voyant Tools can be used as an effective content text analysis tool in general and in Hebrew in particular. Design/methodology/approach: The study's research method analyzes the content of Facebook posts using the Voyant Tools online toolkit to quantitatively analyze and visualize the text-mining results. The sample consists of the texts of posts of three groups representing different currents in Israeli feminism, gathered over a period of three months. Findings: The results show that some high-frequency words occur in all groups, while each group also has unique words that distinguish it from the other groups. Feminist and Halachic Femi...
Traditionally, library catalogues have served as a tool to manage library collections and as a bibliographic tool for information retrieval. Over time this turned library catalogues into data silos. In order to break down these metadata silos, the information must be accessible and free to use. The semantic web, and in particular linked open data, are initiatives that can turn library catalogues into a real part of the Internet. Today libraries are an important player in the linked data arena. Converting catalogues to large linked data sets enables large-scale analysis of cultural heritage Big Data. By implementing linked data initiatives, open library data becomes available for reuse in the information space. Libraries can share their open metadata with non-library communities. Wikidata is a collaboratively edited knowledge base hosted by the Wikimedia Foundation. It is one central database of human knowledge which contains structured and linked data. If more collections will be added to thi...
In this article, we applied large-scale statistical analysis and data visualization techniques to the world's largest collection of Hebrew manuscript metadata records to develop a new methodology for quantitative investigation of the palaeographic, geographic, and temporal characteristics of historical manuscripts. The study aims to explore whether and to what extent the script type of the manuscript and its changes over time can be used to automatically predict and complete missing geospatial data of the manuscripts. To this end, various ontological entities were used as features to train supervised machine-learning algorithms to predict the places of writing of manuscripts, which were often absent in the catalogue records. The obtained results show that while the script type as the sole feature might not be sufficient for predicting the location of the manuscript’s writing, its combination with temporal data of the manuscript yielded about 80% accuracy. Eventually, our sys...
The purpose of this study is to examine the extent to which parents of children aged 10–12 are aware that cyberbullying is a widespread phenomenon, how they deal with acts of cyberbullying performed by or toward their child, whether they take active steps toward preventing cyberbullying by and/or toward their child, and to what extent they are willing to invade their child’s privacy to this end. The study employs a quantitative methodology. One hundred and thirty-three parents were selected from a convenience sample of parents of children in grades 4–6 in a number of public elementary schools. It was found that most parents have heard about cyberbullying, mainly through the various media and not as a result of communicating with their child. Although parents understand that there are psychological effects on victims and criminal consequences for aggressors, most do not deepen their knowledge on the issue. Most parents assume that they can control the phenomenon and distance their chil...
In this study, we present the first large‐scale quantitative analysis of the full corpus of censored historical Hebrew manuscripts that survived through the centuries. A new multi‐dimensional data‐driven approach was applied to explore the influence of censorship on the creation of new manuscripts by Hebrew writers in Italy during the 16th to 18th centuries. Our findings demonstrate that there was a substantial decrease in the creation of new manuscripts in periods of high censorship activity.
In this study we proposed and implemented a new research framework based on a data‐driven approach that aims to quantitatively analyse the world's largest catalogue of Hebrew manuscript metadata records. We used an event‐based ontology model which captures the information on the main types and relationships of manuscript metadata. Then, we utilized ontology‐based inference to automatically complete and expand the data on Hebrew manuscripts recorded in the catalogue and to further mine and analyse this data at large scale. As part of the quantitative analysis we applied statistical and data visualisation techniques in order to compare the effect of time and place on the manuscripts' characteristics and distribution, to discover previously unknown distant relationships between entities and events, to reveal global phenomena and trends, and to better appreciate their impact on Hebrew writing and its different manifestations.
Footprints traces the history and movement of Jewish books since the inception of print. The history of the book is an important part of humanities scholarship. Especially as more books are digitized, scholars, librarians, collectors, and others have become increasingly attuned to the significance of individual books as objects with their own unique story. Jewish books in particular tell a fascinating story about the spread of knowledge and faith in a global Diaspora. Every literary work represents a moment in time and space where an idea was conceived and documented. But the history of a book continues long after composition as it is bought, sold, shared, read, confiscated, stored, or even discarded. This history is the essence of Footprints.
In this research we devised and implemented a semi-automatic approach for building SageBook, a cross-generational social network of the Jewish sages from the Rabbinic literature. The proposed methodology is based on a shallow argumentation analysis leading to detection of lexical–syntactic patterns which represent different relationships between the sages in the text. The method was successfully applied and evaluated on the corpus of the Mishna, the first written work of the Rabbinic literature, which provides the foundation for the development of Jewish law. The constructed prosopographical database and the network generated from its data enable a large-scale quantitative analysis of the sages and their related data, and therefore might contribute to the research of the Talmudic literature and the evolution of Jewish thought throughout the last two millennia.
In this study, we present the first large-scale quantitative analysis of a corpus of censored historical Hebrew manuscripts that have survived through the ages. A new multi-dimensional ontology-based approach was applied to explore the geographic, temporal, actor- and subject-based distribution of censorship events. We adopted an ontology-based approach to apply statistical analysis on the metadata of censored Hebrew manuscripts for estimating the scope and quantifying the extent of the known facts on the censorship activity and its various characteristics over the years. In addition, we revealed some previously unknown phenomena and trends. Particularly, we analysed the relationship between censorship and other types of events in manuscripts' lifecycle and compared the distribution of censored vs. non-censored manuscripts in different dimensions. We also devised a set of rules to complete the missing locations of over 50% of censorship events, which has substantially changed the big picture of the spatial distribution of censorship activity. From the temporal perspective, our findings demonstrate that censorship was conducted in "waves" and that there was a decrease in the creation of new manuscripts in periods of high censorship activity. Certain subjects, such as Kabbalah and Philosophy, were censored significantly more than others, and the distribution of locations and script types in censored manuscripts differs from that in non-censored manuscripts.
ABSTRACT The goal of this research is to develop a generic ontological model for proverbs that unifies potential classification criteria and various characteristics of proverbs to enable their effective retrieval and large-scale analysis. Because proverbs can be described and indexed by multiple characteristics and criteria, we built a multidimensional ontology suitable for proverb classification. To evaluate the effectiveness of the constructed ontology for improving search and retrieval of proverbs, a large-scale user experiment was arranged with 70 users who were asked to search a proverb repository using ontology-based and free-text search interfaces. The comparative analysis of the results shows that the use of this ontology helped to substantially improve the search recall, precision, user satisfaction, and efficiency and to minimize user effort during the search process. A practical contribution of this work is an automated web-based proverb search and retrieval system which incorporates the proposed ontological scheme and an initial corpus of ontology-based annotated proverbs.
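The recall and precision measures mentioned above can be illustrated with their standard set-based definitions. This is a minimal sketch and not the study's evaluation code: the abstract does not describe the exact protocol, and the function name and proverb identifiers below are made up for illustration.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one search, using set-based definitions.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical identifiers: one search returned four proverbs, two of which
# are among the three proverbs judged relevant for the query.
p, r = precision_recall(["p1", "p2", "p3", "p4"], ["p2", "p4", "p5"])
print(p)            # prints 0.5
print(round(r, 3))  # prints 0.667
```

Computing both values per query for each interface would allow the kind of side-by-side comparison of ontology-based and free-text search that the abstract reports.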
... all studies that received the subject classification of “library science,” “information science,” or both during those years in the ProQuest digital dissertations system were retrieved, yet only studies conducted in departments of information science were included in the sample. ...
Historical handwritten Hebrew manuscripts are among the most distinctive and authentic witnesses of Jewish culture and thought that survived through the centuries. To enable systematic research of the knowledge embedded in the manuscripts, there is a need for an ontology: a formal conceptual data model with a high level of semantic granularity. We propose to build a dynamic web-based framework that will allow scholars to create, enrich, and consult an “ontopedia” (ontology-based encyclopedia) of Hebrew manuscripts. The framework is based on an ontology especially designed and implemented for this domain and these goals. We view a manuscript as a “living entity” and propose to design a new ontological data model of the narrative for a manuscript, covering the stages and milestones in its biography (creation, copying, and acquisition). A sequence of events and places constitutes a timeline of history against which manuscripts, people, and their relationships can be placed. Large-scale automated reasoning based on the ontology will also enable us to construct a semantically rich social network of people and manuscripts, and to compare the effect of time and place on the manuscripts’ qualitative characteristics and quantitative distribution.
The study presents the state of bibliographical research in the discipline of Hebrew printing during a 30-year period, ranging from the latter quarter of the 20th century until the beginning of the third millennium (1976-2006). Through bibliographical parameters it characterizes the publications dealing with Hebrew printing, examines whether the published material exhibits laws and systematic regularities that are consistent with Bibliometrics, and describes directions in which the field has developed.
