DJ Concordance
History of the El Colegio Index Project

John Page, Designer for the Colegio de México Digital Indexes Project

Beginnings

The Dzwo Jwan Index project goes back to a conversation between Sinologist John Page, Professor of Chinese Literature at El Colegio de México, and artificial intelligence expert Michael Gerszo. Michael had a project to inventory all the colonial buildings surviving in the old center of Mexico City for the municipal government. The columns and lines representing streets and avenues reminded John of his long-standing interest in the Dzwo Jwan, and his long-standing frustration with the inadequacy and difficulty of the Fraser-Lockhart index as an aid to students reading that text. He asked Michael whether the columns and lines could be equated with Fraser entries and references, and then linked to the DJ text and to Fraser's English meanings in a database. Michael replied that it was quite possible, despite the lack of an optical scanner for Chinese text.

Data entry for the Index was at first done with slashes representing characters. These were later replaced with standard or improvised Big5 codes by two students, Rosa Elena Moncayo and Francisco Hayo, who also digitized the Rwan Ywaen text that accompanies Legge's translation. Funding for the enterprise was provided by CONACYT (Consejo Nacional de Ciencia y Tecnología), Mexico's National Council for Science and Technology, whose support continues at the present time. When Michael Gerszo left Mexico (he and Rosa Elena Moncayo married in 2001), Isabel García Hidalgo took charge of the computer side of the project and put the entire index online.

María Isabel García Hidalgo, Programmer for the Colegio de México Digital Indexes Project

Nature

The problem as it presented itself to the team at the outset was a Dzwo Jwan text of more than 179,500 graphs, condensing to a net lexicon of 3,789 graphs, of which 1,214 register only one meaning each. The remainder share approximately 15,491 meanings, including titles and personal or place names. The Fraser Index to the Dzwo Jwan, as completed by Lockhart, omits 242 of the Dzwo Jwan characters, but its chief defect as an access tool is that it omits something on the order of 100,000 occurrences of characters within the Dzwo Jwan lexicon. It is these "nulls" in the Dzwo Jwan text, the characters which are not linked to a specific Fraser entry and meaning, which became the agenda for the second phase of the project.
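The notion of a "null" can be made concrete with a small sketch. It assumes, purely for illustration, that each occurrence of a graph in the text is a record carrying an optional reference to a Fraser entry; the names `Occurrence`, `entry_id`, `find_nulls` and `coverage` are hypothetical and are not taken from the project's database. Taken at face value, the figures above (on the order of 100,000 unlinked occurrences out of some 179,500) would put the coverage of the original Fraser Index at well under half of the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Occurrence:
    """One graph at one position in the text (hypothetical record layout)."""
    position: int             # running position of the graph in the text
    graph: str                # the character itself
    entry_id: Optional[int]   # linked index entry, or None for a "null"


def find_nulls(occurrences: List[Occurrence]) -> List[Occurrence]:
    """Return the occurrences that are not linked to any index entry."""
    return [o for o in occurrences if o.entry_id is None]


def coverage(occurrences: List[Occurrence]) -> float:
    """Fraction of occurrences that are linked to an entry (0.0 if empty)."""
    if not occurrences:
        return 0.0
    linked = sum(1 for o in occurrences if o.entry_id is not None)
    return linked / len(occurrences)


# Toy data invented for this sketch; not drawn from the Dzwo Jwan or from Fraser.
sample = [
    Occurrence(1, "王", 10),
    Occurrence(2, "伐", 11),
    Occurrence(3, "之", None),   # a "null": present in the text, linked to nothing
    Occurrence(4, "王", 10),
]

nulls = find_nulls(sample)
ratio = coverage(sample)
```

Under this model, the second phase of the project amounts to shrinking the list returned by `find_nulls` until every occurrence points at an entry and a meaning.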

Technically, the Index consists of two parts. The first is a relational database in Microsoft SQL Server, in which the Index functions for querying, retrieving, and updating the data have been programmed. The second is a series of Web pages written in HTML and JavaScript, into which dynamic information retrieved from the database is incorporated by means of the Internet Database Connector, a component of Microsoft Internet Information Services. At present, Chinese characters are encoded on the database server in Big5; how they are displayed on the client computer depends on its platform and on the Internet browser employed by the user, who must indicate that Big5 is the encoding in use. Conversion to Unicode will be made in the very near future. The text may be accessed by Dzwo Jwan year, Chinese character, or English meaning. The Index amounts to an extension and revision, a completion, and a full Internet implementation of the 1930 Fraser Index.
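The planned move from Big5 to Unicode can be sketched in a few lines. This is only an illustration using Python's standard `big5` codec, not the project's actual conversion procedure; the helper names `big5_to_unicode` and `convert_column` are hypothetical. One caveat follows from the data-entry history described earlier: an improvised code will not necessarily fail to decode. It may decode silently to whatever standard character occupies that code point, so any such codes would need a hand-built mapping, which this sketch does not attempt.

```python
from typing import Iterable, List, Tuple


def big5_to_unicode(raw: bytes) -> str:
    """Decode Big5 bytes to a Unicode string.

    Raises UnicodeDecodeError if the bytes are not valid Big5.
    """
    return raw.decode("big5")


def convert_column(values: Iterable[bytes]) -> Tuple[List[str], List[bytes]]:
    """Convert a column of Big5 byte strings, setting aside undecodable values.

    Returns (converted, failed): the decoded strings, and the raw values that
    could not be decoded and therefore need manual review.
    """
    converted: List[str] = []
    failed: List[bytes] = []
    for raw in values:
        try:
            converted.append(big5_to_unicode(raw))
        except UnicodeDecodeError:
            failed.append(raw)
    return converted, failed


# Round-trip demonstration: encode a known string to Big5, then convert it back.
# The second value is a truncated two-byte sequence and cannot be decoded.
stored = ["左傳".encode("big5"), b"\xa4"]
converted, failed = convert_column(stored)
```

Keeping the failures in a separate list, rather than substituting a replacement character, leaves a record of exactly which stored values still need attention.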

Campus of El Colegio de México

Collaboration with the Warring States Project

The Warring States Project formally associated itself with the Dzwo Jwan Index effort in July 2001, under the rubric of Chinese Digital Indexes / Indices Chinos Digitalizados, in an agreement signed by the President of El Colegio de México. The Project served as a recruiting and coordination point for advanced students and other specialists in the US and elsewhere, who contributed to the Dzwo Jwan linking process. All indexers were linked by a dedicated e-mail network, where problems of meaning and interpretation could be quickly resolved. The linking portion of the project was completed on schedule in early 2005.

Those who actively participated in the Dzwo Jwan Index linking phase were:

Lederle Research Tower, University of Massachusetts at Amherst

Advantages

The Dzwo Jwan Index project began as a tool for students beginning to read the Dzwo Jwan as part of intermediate level classical Chinese language study. The index quickly assumed the proportions of a major scholarly tool. One of its advantages over previous concordances or indexes to the Dzwo Jwan is its thesaurus feature, which allows characters to be selected for study by their meanings, expressed as English equivalents. Idea searches, and not merely character searches, thus become possible. This generalized subject index feature is indispensable for those doing thematic research in the text. In these advanced features, in its carefully considered display (with the selected character highlighted in red in each of its contexts), and in its universal availability to the world scholarly community, it ushers in a new era and a new level of sophistication in Chinese reference tools.
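The thesaurus feature can be pictured as a join between meanings and occurrences. The sketch below uses Python's built-in `sqlite3` module in place of Microsoft SQL Server, and its table names, column names, and sample rows are all assumptions made for illustration; none of them come from the project's schema or from Fraser's entries.

```python
import sqlite3

# Hypothetical two-table layout: index entries, and occurrences linked to them.
SCHEMA = """
CREATE TABLE entry (
    id      INTEGER PRIMARY KEY,
    graph   TEXT NOT NULL,
    meaning TEXT NOT NULL
);
CREATE TABLE occurrence (
    id       INTEGER PRIMARY KEY,
    dj_year  TEXT NOT NULL,
    context  TEXT NOT NULL,
    entry_id INTEGER REFERENCES entry(id)
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database filled with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO entry (id, graph, meaning) VALUES (?, ?, ?)",
        [(1, "伐", "to attack"), (2, "侵", "to invade, attack"), (3, "王", "king")],
    )
    conn.executemany(
        "INSERT INTO occurrence (id, dj_year, context, entry_id) VALUES (?, ?, ?, ?)",
        [
            (1, "year A", "placeholder context 1", 1),
            (2, "year B", "placeholder context 2", 2),
            (3, "year B", "placeholder context 3", 3),
        ],
    )
    return conn


def search_by_meaning(conn: sqlite3.Connection, term: str):
    """Return (graph, dj_year, context) for entries whose meaning contains term."""
    return conn.execute(
        """
        SELECT e.graph, o.dj_year, o.context
        FROM occurrence AS o
        JOIN entry AS e ON e.id = o.entry_id
        WHERE e.meaning LIKE ?
        ORDER BY o.id
        """,
        (f"%{term}%",),
    ).fetchall()


conn = build_demo_db()
hits = search_by_meaning(conn, "attack")
```

Because the search starts from the English meaning, a single query here returns passages for two different characters, which is the sense in which an idea search differs from a character search.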

2 Sept 2005