Web Dragons

1st Edition

Inside the Myths of Search Engine Technology

Authors: Ian Witten, Marco Gori, Teresa Numerico
Paperback ISBN: 9780123706096
eBook ISBN: 9780080469096
Imprint: Morgan Kaufmann
Published Date: 3rd November 2006
Page Count: 288

Table of Contents


  1. Setting the Scene
     • According to the Philosophers
     • Enter the Technologists
     • The Information Revolution
     • The World-Wide Web
     • So What?
     • Notes and Sources

  2. Literature and the Web
     • Changing Face of Libraries
     • Metadata
     • So What?
     • Notes and Sources

  3. Meet the Web
     • Basic Concepts
     • Web Pages: Documents and Beyond
     • Metrology and Scaling
     • Structure of the Web
     • So What?
     • Notes and Sources

  4. How to Search
     • Searching Text
     • Searching in a Web
     • Developments in Web Search
     • So What?
     • Notes and Sources

  5. The Web Wars
     • Preserving the Ecosystem
     • Increasing Visibility: Tricks of the Trade
     • Business, Ethics, and Spam
     • The Anti-Spam War
     • So What?
     • Notes and Sources

  6. Who Controls Information?
     • The Violence of the Archive
     • Web Democracy
     • Privacy and Censorship
     • Copyright and the Public Domain
     • The Business of Search
     • So What?
     • Notes and Sources

  7. The Dragons Evolve
     • Communities
     • Private Subnetworks
     • The User as Librarian
     • Your Computer and the Web
     • So What?
     • Notes and Sources




In the eye-blink that has elapsed since the turn of the millennium, the lives of those of us who work with information have been utterly transformed. Pretty well all we need to know is on the web; if not today, then tomorrow. It's where we learn and play, shop and do business, keep up with old friends and meet new ones. What makes it possible for us to find the stuff we need to know? Search engines.

Search engines - "web dragons" - are the portals through which we access society's treasure trove of information. How do they stack up against librarians, the gatekeepers over centuries past? What role will libraries play in a world whose information is ruled by the web? How is the web organized? Who controls its contents, and how do they do it? How do search engines work? How can web visibility be exploited by those who want to sell us their wares? What's coming tomorrow, and can we influence it? We are witnessing the dawn of a new era, starting right now - and this book shows you what it will look like and how it will change your world.

Do you use search engines every day? Are you a developer or a librarian, helping others with their information needs? A researcher or journalist for whom the web has changed the very way you work? An online marketer or site designer, whose career exists because of the web? Whoever you are: if you care about information, this book will open your eyes - and make you blink.

About the authors: Ian H. Witten is professor of computer science at the University of Waikato, where he directs the New Zealand Digital Library research project. He has published widely on digital libraries, machine learning, text compression, hypertext, speech synthesis and signal processing, and computer typography. A fellow of the ACM, he has written several books, including How to Build a Digital Library (2002) and Data Mining (2005), both from Morgan Kaufmann.

Key Features

  • Presents a critical view of the idea of funneling information access through a small handful of gateways and the notion of a centralized index--and the problems that may cause.

  • Provides promising approaches for addressing the problems, such as the personalization of web services.

  • Presented by authorities in the field of digital libraries, web history, machine learning, and web and data mining.

  • Find more information at the authors' site: webdragons.net.


This book is for anyone who is interested in, or needs information on, today's fast-changing landscape of information access, and for those who use search engines daily and may be affected by web spamming, selective access to information, or monopolistic control of information, to name just a few of the problems. Typical readers include those in the software business, particularly in search engines, web content management, knowledge management, web advertising, and the law and ethics that surround this field; professionals in information science; librarians; and anyone who is interested in the ways in which the increasing amount of information will become accessible to us.


© Morgan Kaufmann 2007


It is not a resource on how search engines work, but rather what ideas and ideals have been realized in the development of search engines, the political and human challenges they face and problems and opportunities they present to humans and to the nature of knowledge and information. The book is written in a clear, simple fashion, making it accessible to all readers. The broad swath it cuts, however, does not detract from its use as an academic course resource. --Choice, June 2007

If you've ever searched the web for information and wondered what's going on behind that query box, I recommend you read Web Dragons. It puts Internet search engines in context - part of a legacy of information access dating back thousands of years. It explains in plain language how search engines work, and points out potential pitfalls that thoughtful searchers should consider. Web Dragons is clear and engaging. Given the amount of time and trust we all invest in search engines, if you pay attention to the web I highly recommend redirecting some of that attention to this book. --Craig Nevill-Manning, Engineering Director, Google

Search technology is changing the way people understand and interact with the world. Web Dragons takes a revealing look at the evolution of search and how it will shape the future of information technology. --Prabhakar Raghavan, Head of Yahoo! Research

Witten, Gori and Numerico steadily bring the web into sharper and sharper focus. A daunting expanse is revealed to have structure. The structure enables the knowledgeable to navigate it to their benefit and allows the unscrupulous or careless to create pitfalls and traps. Search engines will be critical tools for most people living today. What could be more important than understanding how these technologies work and where they are going? --Jonathan Grudin, Microsoft Research

About the Authors

Ian Witten Author

Ian H. Witten is a professor of computer science at the University of Waikato in New Zealand. He directs the New Zealand Digital Library research project. His research interests include information retrieval, machine learning, text compression, and programming by demonstration. He received an MA in Mathematics from Cambridge University, England; an MSc in Computer Science from the University of Calgary, Canada; and a PhD in Electrical Engineering from Essex University, England. He is a fellow of the ACM and of the Royal Society of New Zealand. He has published widely on digital libraries, machine learning, text compression, hypertext, speech synthesis and signal processing, and computer typography. He has written several books, the latest being Managing Gigabytes (1999) and Data Mining (2000), both from Morgan Kaufmann.

Affiliations and Expertise

Professor, Computer Science Department, University of Waikato, New Zealand.

Marco Gori Author

Professor Gori's research interests are in the field of artificial intelligence, with emphasis on machine learning and game playing. He is a co-author of the book “Web Dragons: Inside the Myths of Search Engine Technology,” Morgan Kaufmann (Elsevier), 2007. He was the Chairman of the Italian Chapter of the IEEE Computational Intelligence Society, and the President of the Italian Association for Artificial Intelligence. He is on the list of top Italian scientists kept by VIAAcademy (http://www.topitalianscientists.org/top_italian_scientists.aspx). Dr. Gori is a fellow of the IEEE, ECCAI, and IAPR.

Affiliations and Expertise

Department of Information Engineering and Mathematics, University of Siena, Italy

Teresa Numerico Author

Affiliations and Expertise

University of Salerno, Italy