Inside the Myths of Search Engine Technology
By
- Ian Witten
- Marco Gori
- Teresa Numerico
In the eye-blink that has elapsed since the turn of the millennium, the lives of those of us who work with information have been utterly transformed. Pretty well all we need to know is on the web; if not today, then tomorrow. It's where we learn and play, shop and do business, keep up with old friends and meet new ones. What makes it possible for us to find the stuff we need to know? Search engines.

Search engines - "web dragons" - are the portals through which we access society's treasure trove of information. How do they stack up against librarians, the gatekeepers over centuries past? What role will libraries play in a world whose information is ruled by the web? How is the web organized? Who controls its contents, and how do they do it? How do search engines work? How can web visibility be exploited by those who want to sell us their wares? What's coming tomorrow, and can we influence it? We are witnessing the dawn of a new era, starting right now - and this book shows you what it will look like and how it will change your world.

Do you use search engines every day? Are you a developer or a librarian, helping others with their information needs? A researcher or journalist for whom the web has changed the very way you work? An online marketer or site designer, whose career exists because of the web? Whoever you are: if you care about information, this book will open your eyes - and make you blink.

About the authors:

Ian H. Witten is professor of computer science at the University of Waikato, where he directs the New Zealand Digital Library research project. He has published widely on digital libraries, machine learning, text compression, hypertext, speech synthesis and signal processing, and computer typography. A fellow of the ACM, he has written several books, including How to Build a Digital Library (2002) and Data Mining (2005), both from Morgan Kaufmann.

Marco Gori is professor of computer science at the University of Siena, where he leads the artificial intelligence research group. He is the Chairman of the Italian Chapter of the IEEE Computational Intelligence Society, a fellow of the IEEE and of the ECCAI, and former President of the Italian Association for Artificial Intelligence.

Teresa Numerico teaches network theory and communication studies at the University of Rome 3, and is a researcher in Philosophy of Science at the University of Salerno. Previously she was employed as a business development and marketing manager for various media companies, including the Italian branch of Turner Broadcasting System (CNN and Cartoon Network).
This book is for those who are interested in, or need information on, today's fast-changing landscape of information access, who use search engines daily, and who may be affected by web spamming, selective access to information, or the problems of monopolistic control of information, to name just a few issues. Typical readers would be those in the software business, in particular in search engines, web content management, knowledge management, web advertising, and the law and ethics that surround this field; professionals in information science; librarians; and anyone who is interested in how the growing volume of information will become accessible to us.
Paperback, 288 Pages
Published: November 2006
Imprint: Morgan Kaufmann
It is not a resource on how search engines work, but rather what ideas and ideals have been realized in the development of search engines, the political and human challenges they face and problems and opportunities they present to humans and to the nature of knowledge and information. The book is written in a clear, simple fashion, making it accessible to all readers. The broad swath it cuts, however, does not detract from its use as an academic course resource. --Choice, June 2007

If you've ever searched the web for information and wondered what's going on behind that query box, I recommend you read Web Dragons. It puts Internet search engines in context - part of a legacy of information access dating back thousands of years. It explains in plain language how search engines work, and points out potential pitfalls that thoughtful searchers should consider. Web Dragons is clear and engaging. Given the amount of time and trust we all invest in search engines, if you pay attention to the web I highly recommend redirecting some of that attention to this book. --Craig Nevill-Manning, Engineering Director, Google

Search technology is changing the way people understand and interact with the world. Web Dragons takes a revealing look at the evolution of search and how it will shape the future of information technology. --Prabhakar Raghavan, Head of Yahoo! Research

Witten, Gori and Numerico steadily bring the web into sharper and sharper focus. A daunting expanse is revealed to have structure. The structure enables the knowledgeable to navigate it to their benefit and allows the unscrupulous or careless to create pitfalls and traps. Search engines will be critical tools for most people living today. What could be more important than understanding how these technologies work and where they are going? --Jonathan Grudin, Microsoft Research
- Preface
- 1. Setting the Scene
  - According to the Philosophers
  - Enter the Technologists
  - The Information Revolution
  - The World-Wide Web
  - So What?
  - Notes and Sources
- 2. Literature and The Web
  - Changing Face of Libraries
  - Metadata
  - So What?
  - Notes and Sources
- 3. Meet the Web
  - Basic Concepts
  - Web Pages: Documents and Beyond
  - Metrology and Scaling
  - Structure of the Web
  - So What?
  - Notes and Sources
- 4. How to Search
  - Searching Text
  - Searching in a Web
  - Developments in Web Search
  - So What?
  - Notes and Sources
- 5. The Web Wars
  - Preserving the Ecosystem
  - Increasing Visibility: Tricks of the Trade
  - Business, Ethics, and Spam
  - The Anti-Spam War
  - So What?
  - Notes and Sources
- 6. Who Controls Information?
  - The Violence of the Archive
  - Web Democracy
  - Privacy and Censorship
  - Copyright and the Public Domain
  - The Business of Search
  - So What?
  - Notes and Sources
- 7. The Dragons Evolve
  - Communities
  - Private Subnetworks
  - The User as Librarian
  - Your Computer and the Web
  - So What?
  - Notes and Sources
- References
- Index