AmLaw Tech (Spring, 1998)
The X-Files Are Coming

by Ethan Katsh

My ten-year-old son was recently given an assignment by his teacher to write a report on Marco Polo. After looking at a few library books and consulting an encyclopedia, he asked me if we might "look on the Web" for some information. This did not seem like an altogether bad idea, so we accessed one of the search engines, typed in Marco Polo, and were rewarded with links to over a thousand sites in which Marco Polo was mentioned. My son was overjoyed that we had found so much. As we looked at the first few sites, however, I quickly realized that our success was largely illusory. Most of the sites we found, for example, were Chinese restaurants named Marco Polo. Many of the others were stores, cruise ships, hotels, and other places that had nothing to do with the Marco Polo we were hunting for.

My son was persistent, however, and we continued to look through screen after screen. And finally, we did find a link to a research paper written about Marco Polo. We clicked on the link and, indeed, a very nice-looking document, with both text and images, appeared on the screen. My son again was elated, thinking that he had found the perfect source. I soon realized, however, that what we had found was a very nice Web site put up by a fifth-grade teacher in Michigan. This teacher had given her class the same assignment that my son's teacher had given his class and then had placed all the student papers on the Web.

I tried to explain to my son that this other ten-year-old's report was not really the kind of reference material his teacher had in mind. I told him that quoting from this paper was not much different from quoting from the paper of another fifth grader in his own school. This line of reasoning, however, was useless. The document was presented on screen so attractively and authoritatively that in my son's mind, "who could ask for anything more?" I accepted reality, therefore, and took the easy way out.
I decided to let his teacher bear the burden of explaining why another fifth grader's paper was not the ideal source or reference material for him.

Most of us are aware that the online world does not provide us with many traditional cues for assessing the quality of information. On the Web, the document of a ten-year-old may look little different from the document of an eminent scholar or a distinguished publisher. We, and I include myself here among the millions who have put up Web sites, have been seduced by a system that prizes appearance and quick retrieval over great distances more than anything else. We have been seduced because it is indeed a miraculous system. Could we ask for anything more?

One reason the Web has grown so fast is that you can use it without knowing anything about HTML, the language with which Web files are created. Users do not have to understand HTML because our software, browsers such as Netscape or Internet Explorer, does that for us. The HTML language tells the browser whether text should be large or small, where an image should be, and how long a sound file should play. HTML is so hidden from users that it is easy to forget that it is a language with weaknesses as well as strengths. Its strength is that it is a language that tells the browser a great deal about style, display, and presentation. It allows properly formatted content from any machine anywhere to be displayed on your screen. What it does not do is communicate anything about the content of a displayed document to the browser or to your machine. The reason we search for words or combinations of words to find Web documents is that this is the only way, however imperfect it may be, to obtain any information about a document.

Smart machines should process data as well as communicate it. What new opportunities and ventures might be supported, therefore, if a browser could "know" something about the content of the information it is displaying?
What if there were a language that could, using data provided by Web site creators, allow browsers and machines to identify content much more accurately and clearly than is now possible?

This is where the X-files come in. These X-files are not X-rated files and they do not have any connection with the television program. Rather, these are XML files, or, more specifically, files that have the extension .xml just as most World Wide Web files have the extension .html. Just as HTML files inform the browser about style and display, XML files tell the browser about content. For example, in addition to knowing that some piece of text should appear in a large font on a blue background, the browser would also know that the highlighted text represented the price of an object. If it knew this, it could process the information in some way.

Clearly, we have impressive capabilities for displaying content on screen and for acquiring whole documents from anywhere at any time. But what if we had not simply a growing number of tags for structuring display, but a whole new language that enabled us to interact with each other and with information in new ways? What if all those millions who have somehow learned to build Web sites and use software to write HTML-formatted text began using equally powerful software to embed codes about that information, codes that allowed other machines to process that information?

XML is a language that can communicate information about content and provide us with browsers that understand this information. The appearance of such a language might seem like a minor event in the history of computing, an incremental change somewhat on the order of version 4 of one of the browsers appearing to replace version 3. It may, however, turn out to be something much more substantial and significant. The reason for this is that XML is not simply new software but a new language.
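The price example mentioned earlier can be made concrete with a small sketch. Everything in it is an assumption for illustration: the element names (catalog, item, name, price), the currency attribute, and the sample entries are invented, and the program uses Python's standard XML parser rather than any browser. The point is only that once a page labels a number as a price, a machine can do something with it, such as finding the cheapest item, without guessing from fonts or colors.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element names (catalog, item, name, price), the
# "currency" attribute, and the sample entries are invented for this sketch.
CATALOG_XML = """
<catalog>
  <item>
    <name>Travels of Marco Polo (paperback)</name>
    <price currency="USD">12.95</price>
  </item>
  <item>
    <name>Marco Polo: A Biography</name>
    <price currency="USD">24.50</price>
  </item>
</catalog>
"""


def cheapest_item(xml_text):
    """Return (name, price) for the lowest-priced item in the catalog."""
    root = ET.fromstring(xml_text.strip())
    items = []
    for item in root.findall("item"):
        # The tags say what each piece of text means, so no guessing is
        # needed about which string is the name and which is the price.
        name = item.findtext("name")
        price = float(item.findtext("price"))
        items.append((name, price))
    return min(items, key=lambda pair: pair[1])


name, price = cheapest_item(CATALOG_XML)
print(f"{name}: {price:.2f}")
```

An HTML page could show the same two entries in any font or color, but a program reading it would have no reliable way to tell which string of digits was a price.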
XML will have an impact by energizing all those millions who already are representing their businesses, services, products, and creative energies on the Web to provide data on what their enterprises are actually about. All this data will be as machine readable as information in a form or database field and as capable of being processed by other machines. This data becomes the raw material for new informational products and the catalyst for new informational activities. XML, therefore, may be as much of an energizing force as HTML has been.

And this force may be unleashed fairly soon. What is rather remarkable about XML is that its appearance is imminent and yet, at the same time, largely a secret. XML may be the Web's biggest secret, albeit not a very well-kept secret. It is not hard, even using conventional search engines, to find Web sites devoted to XML (http://www.w3.org/XML/, http://www.arbortext.com), a major one being on the Microsoft Web site (http://microsoft.com/xml/).

It is also not hard to figure out why XML is almost universally attractive. What is driving the movement to XML is commerce. XML will allow a currently static Web page to function like a database, but without requiring the resources now required to put a database on the Web. Because of this value to the commercial sector, Microsoft and all the major software companies have encouraged the development of XML. Some features for browsing XML sites are even contained in Internet Explorer 4.0. As Nick Finke, Director of the University of Cincinnati Law School Center for Electronic Text in the Law, notes, "since XML relies completely on known technology, it should arrive on time."

While commerce is driving the development of XML, anyone offering informational services or products will be provided with new tools to create, present, acquire, process, and manage data. Anyone in an information business (law?) will be challenged to compete even more than now to define where their expertise lies and what is inherently valuable about their work.

What does this mean for lawyers? It probably means more competition, as tools of access to information are enhanced and exclusive control over all bodies of information becomes harder to justify. But XML also promises to foster new markets and marketplaces, new economic relationships and partnerships, and new patterns of interaction among people and groups, all of which will require new models for dealing with conditions of risk and complexity. It certainly will lead to new kinds of conflicts over the use and ownership of information. As a result, those lawyers who take note of the arrival of XML may very well think, "who could ask for anything more?"

*** University of Massachusetts Center for Information Technology and Dispute Resolution