Forthcoming titles in the ACM Books Series are subject to change and will be published as they become available, with 25 titles to be published in Collection I. Upon publication, each of the following books will appear in the ACM Digital Library and be accessible to those with full-text access in both PDF and ePub formats. Individual titles will be made available for purchase at Morgan & Claypool and also available at Amazon and Barnes & Noble.
Structural bioinformatics is the field related to the development and application of computational models for the prediction and analysis of macromolecular structures. The unique nature of protein and nucleotide structures has presented many computational challenges over the last three decades. The fast accumulation of data, in addition to the rapid increase in computational power, presents a unique set of challenges and opportunities in the analysis, comparison, modeling, and prediction of macromolecular structures and interactions.
The book is intended as a user's guide for key algorithms to solve problems related to macromolecular structure, with emphasis on protein structure, function and dynamics. It can be used as a textbook for a one-semester graduate course in algorithms in bioinformatics.
Code Nation is a popular history of programming and software culture from the first years of personal computing in the 1970s to the early commercial infrastructure of the World Wide Web. This illustration-rich book offers profiles of ACM members and luminaries who have had an important influence on programming practices, as well as the formative experiences of students, power users, and tinkerers who learned to code on early PCs and built captivating games and applications.
Central to this history is the learn-to-program movement, an educational agenda that germinated in government labs, gained momentum through business and counterculture experiments, and became a broad-based computer literacy movement in the 1970s and 1980s.
Despite conflicts about languages, operating systems, and professional practices, the number of active programmers in America jumped from tens of thousands in the late 1950s to tens of millions by the early 1990s. This surge created a groundswell of popular support for programming culture, resulting in a “Code Nation”—a globally-connected society saturated with computer software and enchanted by its use.
From Siri to Alexa to Cortana, conversational interfaces are hitting the mainstream and becoming ubiquitous in our daily lives. However, user experiences with such applications remain disappointing. Although it is easy to get a system to produce words, none of the current agents or bots display general conversational competence. Modeling natural conversation is still a hard problem. Conversational UX Design: A Methodology for Practitioners, coming from ACM Books in 2018, will help change that.
In applying Conversation Analysis to Conversational UX Design, authors Robert J. Moore and Raphael Arar demonstrate a methodology for designing conversational user experiences, address several topics in conversational UX design and present multiple conversational UX patterns.
Bob Moore is a Research Staff Member at IBM Research-Almaden, where he examines the intersection of human conversation and technology. Currently, he is developing a methodology for Conversational UX Design that applies the formal, qualitative models of natural human conversation, from the field of Conversation Analysis, to the design of conversational interfaces. He has developed a general Natural Conversation Framework, also implemented on the IBM Watson Conversation service, which defines a set of conversational UX patterns.
Raphael Arar is a Designer & Researcher at IBM Research and Adjunct Faculty at San Jose State University in their Digital Media Art Program. He is a recent Forbes 30 Under 30 honoree in Enterprise Technology. Previously he was the Lead UX Designer for the Apple + IBM Partnership, Lecturer at the University of Southern California’s Media Arts + Practice Division and an Art & Technology Fellow at CalArts.
Data quality is one of the most important problems in data management, since dirty data often leads to inaccurate data analytics results and bad business decisions. Poor data across businesses and the government costs the U.S. economy $3.1 trillion a year, according to a report by InsightSquared in 2012.
Various tools and techniques have been proposed to detect data errors and anomalies. For example, data quality rules or integrity constraints have been proposed as a declarative way to describe legal or correct data instances. Any subset of data that does not conform to the defined rules is considered erroneous, which is also referred to as a violation.
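As an illustration of how a declarative quality rule flags violations, the sketch below checks one kind of integrity constraint, a functional dependency. The rule used here (a zip code determines a city), the sample records, and the helper name `find_fd_violations` are all assumptions made for this example and are not taken from the book.

```python
from collections import defaultdict


def find_fd_violations(rows, lhs, rhs):
    """Return groups of rows that agree on `lhs` but disagree on `rhs`.

    Each returned group is a violation of the rule "lhs determines rhs".
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[lhs]].append(row)

    violations = {}
    for key, members in groups.items():
        # More than one distinct rhs value for the same lhs value breaks the rule.
        if len({member[rhs] for member in members}) > 1:
            violations[key] = members
    return violations


# Hypothetical sample data, invented for this sketch.
records = [
    {"id": 1, "zip": "10001", "city": "New York"},
    {"id": 2, "zip": "10001", "city": "New York"},
    {"id": 3, "zip": "10001", "city": "Newark"},
    {"id": 4, "zip": "94105", "city": "San Francisco"},
]

violations = find_fd_violations(records, "zip", "city")
for key, members in violations.items():
    print(key, [member["id"] for member in members])
```

The check only reports which records jointly violate the rule. Deciding which of them is actually wrong is the separate repair problem.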
Repairing dirty data sets is often a more challenging task. Multiple techniques with different objectives have been introduced. Some of these aim to minimally change the database, such that the data conforms to the declared quality rules; others involve users or knowledge bases to verify the repairs.
In this book, we discuss the main facets and directions in designing error detection and repairing techniques. We start by surveying anomaly detection techniques, based on what, how, and where to detect. We then propose a taxonomy of the various aspects of data repairing, including the repair target, the automation of the repair process, and the update model. The book also highlights new trends in data cleaning algorithms to cope with current Big Data settings, focusing on scalable data cleaning techniques for large data sets.
Database replication is widely used for fault-tolerance, scalability, and performance. The failure of one database replica does not stop the system from working, as available replicas can take over the tasks of the failed replica. Scalability can be achieved by distributing the load across all replicas, and adding new replicas should the load increase. Finally, database replication can provide fast local access, even if clients are geographically distributed, if data copies are located close to clients. Despite its advantages, replication is not a straightforward technique to apply, and there are many hurdles to overcome. At the forefront is replica control: assuring that data copies remain consistent when updates occur. There exist many alternatives in regard to where updates can occur, when changes are propagated to data copies, how changes are applied, where the replication tool is located, etc. A particular challenge is to combine replica control with transaction management, as the latter requires several operations to be treated as a single logical unit and provides atomicity, consistency, isolation and durability across the replicated system.
This book provides a categorization of replica control mechanisms, presents several replica and concurrency control mechanisms in detail, and discusses many of the issues that arise when such solutions need to be implemented within or on top of relational database systems. Furthermore, the book presents the tasks that are needed to build a fault-tolerant replication solution, provides an overview of load-balancing strategies that allow load to be equally distributed across all replicas, and introduces the concept of self-provisioning that allows the replicated system to dynamically decide on the number of replicas that are needed to handle the current load. As performance evaluation is a crucial aspect when developing a replication tool, the book presents an analytical model of the scalability potential of various replication solutions.
Martin Hellman and Whitfield Diffie won the 2015 Turing Award for the development of public-key cryptography. This book provides original biographies of the award winners and describes the historical and political context and impact of their research, including its influence on the development of internet security, theoretical computer science, and national security. It also summarizes and compiles key documents, including the original research articles that led to the Turing Award, interviews with Hellman and Diffie, and the Turing Award lectures.
In this book we aim to show how data mining and machine learning techniques are used in the context of event mining. We review event recognition and event discovery in different applications and cover recent developments in event mining such as techniques for temporal pattern mining, temporal data classification and clustering. We also introduce EventMiner as a comprehensive knowledge-based event mining framework for analyzing heterogeneous big data.
The book will be useful for both practitioners and researchers working in different computer science fields. Data miners/scientists and data analysts can benefit from high-performance event mining techniques introduced in this book. The book is also accessible to many readers, not only those with strong backgrounds in computer science. Public health professionals, epidemiologists, physicians, and social scientists can benefit from the new perspective of this book in harnessing the value of heterogeneous big data for building diverse real-life applications.
ACM Books is pleased to announce the signing of a new book in our Turing Award series, Foundations of Computation and Machine Learning: The Work of Leslie Valiant, edited by Rocco Servedio of Columbia University.
Valiant received the prestigious ACM Turing Award in 2010 "for transformative contributions to the theory of computation, including the theory of probably approximately correct (PAC) learning, the complexity of enumeration and of algebraic computation, and the theory of parallel and distributed computing."
The book will feature a short biography of Valiant, as well as analysis of his seminal works by today's leading computer scientists.
Nash equilibrium is the central solution concept in Game Theory. Since Nash’s original paper in 1951, it has found countless applications in modeling strategic behavior of traders in markets, (human) drivers and (electronic) routers in congested networks, nations in nuclear disarmament negotiations, and more. A decade ago, the relevance of this solution concept was called into question by computer scientists [Chen et al. 2009b, Daskalakis et al. 2009a], who proved (under appropriate complexity assumptions) that computing a Nash equilibrium is an intractable problem. And if centralized, specially designed algorithms cannot find Nash equilibria, why should we expect distributed, selfish agents to converge to one? The remaining hope was that at least approximate Nash equilibria can be efficiently computed. Understanding whether there is an efficient algorithm for approximate Nash equilibrium has been the central open problem in this field for the past decade. In this book, we provide strong evidence that even finding an approximate Nash equilibrium is intractable. We prove several intractability theorems for different settings (two-player games and many-player games) and models (computational complexity, query complexity, and communication complexity). In particular, our main result is that under a plausible and natural complexity assumption (“Exponential Time Hypothesis for PPAD”), there is no polynomial-time algorithm for finding an approximate Nash equilibrium in two-player games.
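To make the notion of an approximate Nash equilibrium concrete, the sketch below verifies whether a given pair of mixed strategies is one. It assumes the common additive definition, in which neither player can gain more than eps by switching to any pure strategy; the blurb does not state which definition the book uses. The function names and the matching-pennies payoffs are illustrative. Note that the code only checks a candidate, which is easy; the hardness results described above concern finding one.

```python
def expected_payoff(payoffs, x, y):
    """Expected payoff when the row player mixes with x and the column player with y."""
    return sum(
        x[i] * y[j] * payoffs[i][j]
        for i in range(len(x))
        for j in range(len(y))
    )


def is_approx_nash(a, b, x, y, eps):
    """Check the (assumed) additive eps-Nash condition for a two-player game.

    a[i][j] and b[i][j] are the row and column player's payoffs when the
    row player picks i and the column player picks j.
    """
    row_value = expected_payoff(a, x, y)
    col_value = expected_payoff(b, x, y)
    # Best payoff each player could get from a pure-strategy deviation.
    best_row = max(
        sum(y[j] * a[i][j] for j in range(len(y))) for i in range(len(x))
    )
    best_col = max(
        sum(x[i] * b[i][j] for i in range(len(x))) for j in range(len(y))
    )
    return best_row - row_value <= eps and best_col - col_value <= eps


# Matching pennies, used here only as a small illustrative game.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]

print(is_approx_nash(A, B, [0.5, 0.5], [0.5, 0.5], 1e-9))  # uniform play
print(is_approx_nash(A, B, [1, 0], [1, 0], 0.1))           # both play the first action
```

Checking only pure-strategy deviations is sufficient here, because a mixed deviation is a weighted average of pure ones and cannot do better than the best of them.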
The problem of approximate Nash equilibrium in a two-player game poses a unique technical challenge: it is a member of the class PPAD, which captures the complexity of several fundamental total problems, i.e. problems that always have a solution; and it also admits a quasipolynomial ($\approx n^{\log n}$) time algorithm. Either property alone is believed to place this problem far below NP-hard problems in the complexity hierarchy; having both simultaneously places it just above P, at what can be called the frontier of intractability. Indeed, the tools we develop in this book to advance on this frontier are useful for proving hardness of approximation of several other important problems whose complexity lies between P and NP:
Brouwer’s fixed point: Given a continuous function $f$ mapping a compact convex set to itself, Brouwer’s fixed point theorem guarantees that $f$ has a fixed point, i.e. $x$ such that $f(x) = x$. Our intractability result holds for the relaxed problem of finding an approximate fixed point, i.e. $x$ such that $f(x) \approx x$.
Market equilibrium: Market equilibrium is a vector of prices and allocations where the supply meets the demand for each good. Our intractability result holds for the relaxed problem of finding an approximate market equilibrium, where the supply of each good approximately meets the demand.
CourseMatch (A-CEEI): Approximate Competitive Equilibrium from Equal Income (A-CEEI) is the economic principle underlying CourseMatch, a system for fair allocation of classes to students (currently in use at Wharton, University of Pennsylvania).
Densest k-subgraph: Our intractability result holds for the following relaxation of the k-Clique problem: given a graph containing a k-clique, the algorithm has to find a subgraph over k vertices that is “almost a clique”, i.e. most of the edges are present.
Community detection: We consider a well-studied model of communities in social networks, where each member of the community is friends with a large fraction of the community, and each non-member is only friends with a small fraction of the community.
VC dimension and Littlestone dimension: The Vapnik-Chervonenkis (VC) dimension is a fundamental measure in learning theory that captures the complexity of a binary concept class. Similarly, the Littlestone dimension is a measure of complexity of online learning.
Signaling in zero-sum games: We consider a fundamental problem in signaling, where an informed signaler reveals private information about the payoffs in a two-player zero-sum game, with the goal of helping one of the players.
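The “almost a clique” condition in the densest k-subgraph item above can be made concrete by measuring what fraction of the possible edges among a chosen vertex set is present. The sketch below is only that measurement; the function name, the example graph, and any threshold for “most of the edges” are assumptions for illustration, since the blurb does not fix one.

```python
from itertools import combinations


def edge_density(edges, vertices):
    """Fraction of vertex pairs within `vertices` that are joined by an edge."""
    members = list(vertices)
    possible = len(members) * (len(members) - 1) // 2
    if possible == 0:
        return 1.0  # convention chosen for this sketch: no pairs, nothing missing
    edge_set = {frozenset(edge) for edge in edges}
    present = sum(
        1 for u, v in combinations(members, 2) if frozenset((u, v)) in edge_set
    )
    return present / possible


# Illustrative graph: a 4-clique on {0, 1, 2, 3} plus one extra edge to vertex 4.
graph_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]

print(edge_density(graph_edges, {0, 1, 2, 3}))  # the planted clique
print(edge_density(graph_edges, {0, 1, 2, 4}))  # a sparser choice of 4 vertices
```

Scoring one candidate set is cheap. The difficulty lies in searching over all sets of k vertices for one whose density is close to 1.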
This book addresses challenges in moving from homogeneous computing to heterogeneous computing, where it is necessary to deal with computing nodes of different capabilities and characteristics, such as general-purpose cores, GPUs, FPGAs, automata processors, and neuromorphic chips.
Online advertising has grown from almost nothing at the end of the last century to an annual spend of over 200 billion dollars globally. Today, online advertising garners the most advertising dollars of any advertising channel, including TV. Online advertising is computational advertising, since most of the decisions about which ads to show to a user in a given context are determined by algorithms. Indeed, computational advertising was one of the first big data applications. For this reason, the problems behind computational advertising have driven research into large-scale machine learning and algorithmic game theory and are responsible for many advances in those areas as well as in parallel computing architectures. This book covers the current state of the art of computational advertising. That includes the economics of online advertising, understanding and modeling consumer behavior, matching ads and consumers, user response prediction and measurement of ad effectiveness. We also cover ad allocation, campaign management and optimization, as well as fraud and privacy issues. Today, computational advertising intersects computer science, economics, marketing and psychology. Hence, after 20 years of advances in this field, we hope this book fills the needs of researchers, practitioners and graduate students who want to understand the state of the art in this multidisciplinary area.
Stephen A. Cook was awarded the ACM Turing Award in 1982, in recognition of "his advancement of our understanding of the complexity of computation in a significant and profound way." Cook's theory of NP-completeness is one of the most fundamental and enduring contributions in computer science, and has had a significant impact outside the field. This volume will present works on NP-completeness and other contributions which, while perhaps not as well known, have also had a significant impact on computing theory and practice, as well as mathematical logic. With additional material, including a biographical chapter, Professor Cook's Turing Award address, and a full bibliography of his work, the volume will provide an excellent resource for anyone wishing to understand the foundations of Cook's work as well as its ongoing significance and relevance to current research problems in computing and beyond.
Pointer analysis provides information to disambiguate indirect reads and writes of data through pointers and indirect control flow through function pointers or virtual functions. Thus it enables application of other program analyses to programs containing pointers. There is a large body of literature on pointer analysis. However, there is no material that brings out a uniform coherent theme by separating fundamental concepts from advanced techniques and tricks. This book fills this void.
The book focuses on fundamental concepts instead of trying to cover the entire breadth of the literature on pointer analysis. Bibliographic notes point the reader to relevant literature for more details.
Rather than being driven completely by pointer analysis’s practical effectiveness, the book evolves the concepts from the first principles based on the language features, brings out the interactions of different abstractions at the level of ideas, and finally, relates them to practical observations and the nature of practical programs.
Principles of Graph Data Management and Analytics is the first textbook on the subject for upper-level undergraduates, graduate students and data management professionals who are interested in the new and exciting world of graph data management and computation. The book blends together two thinly connected disciplines: a database-minded approach to managing and querying graphs, and an analytics-driven approach to performing scalable computation on large graphs. It presents a detailed treatment of the underlying theory and algorithms, and prevalent techniques and systems; it also presents textbook use cases and real-world problems that can be solved by combining database-centric and analysis-centric approaches. The book will enable students to understand the state of the art in graph data management, to effectively program currently available graph databases and graph analytics products, and to design their own graph data analysis systems. To help this process, the book supplements its textual material with several data sets, small and large, that will be made available through the book’s website. Several free and contributed software packages will also be provided for readers to practice with.
The book brings together some of Goldwasser and Micali's seminal research, analyzed and discussed in light of the lasting effect it has had on computer science and cryptography.
Working together, Goldwasser and Micali pioneered the field of provable security, which laid the mathematical foundations that made modern cryptography possible. By formalizing the concept that cryptographic security had to be computational rather than absolute, they created mathematical structures that turned cryptography from an art into a science. Their work addresses important practical problems such as the protection of data from being viewed or modified, providing a secure means of communications and transactions over the Internet. Their advances led to the notion of interactive and probabilistic proofs and had a profound impact on computational complexity, an area that focuses on classifying computational problems according to their inherent difficulty.
This book discusses the capabilities of Linked-Data and the Semantic Web modeling languages, such as RDFS (Resource Description Framework Schema) and OWL (Web Ontology Language) as well as more recent standards based on these. The book provides examples to illustrate the use of Semantic Web technologies in solving common modeling problems with many exercises and examples of the use of the techniques.
The book provides an overview of the Semantic Web and aspects of the Web and its architecture relevant to Linked Data. It then discusses semantic modeling and how it can support the progression from chaotic information gathering to an environment characterized by information sharing, cooperation, and collaboration. It also explains the use of RDF and linked data to implement the Semantic Web by allowing information to be distributed over the Web or over intranets, along with the use of SPARQL to access RDF data.
Moreover, the reader is introduced to components that make up a Semantic Web deployment and how they fit together, the concept of inferencing in the Semantic Web, and how RDFS differs from other schema languages. In addition, the 2015 “Linked Data Platform” standard is explored. The book also considers the use of SKOS (Simple Knowledge Organization System) to manage vocabularies by taking advantage of the inferencing structure of RDFS-Plus. It also presents SHACL, a language for checking graph constraints in linked data systems, and a number of useful ontologies including schema.org, the most successfully deployed Semantic Web technology to date.
This book is intended for the linked data and semantic Web practitioner looking for clues on how to add more expressivity to allow better linking and use both on the web and in the enterprise, and for the working ontologist who is trying to create a domain model on the Semantic Web.
Software history has a deep impact on current software designers, computer scientists and technologists. Decisions and design constraints made in the past are often unknown or poorly understood by current students, yet modern software systems use software based on those earlier decisions and design constraints. This work looks at software history through specific software areas and extracts student-consumable practices, learnings, and trends that are useful in current and future software design. It also exposes key areas that are highly used in modern software, yet no longer taught in most computing programs. Written as a textbook, this book uses specific past and current cases to explore the impact of software evolution trends.
Static program analysis studies the behavior of programs under all possible inputs. It is an area with a wealth of applications, in virtually any tool that processes programs. A compiler needs static analysis in order to detect errors (e.g., undefined variables) or to optimize code (e.g., eliminate casts or devirtualize calls). A refactoring or a program understanding tool needs global program analysis in order to answer useful questions such as “where could this program variable have been set?” or “which parts of the program can influence this value?” A security analyzer needs program analysis to determine “can the arguments of this private operation ever be affected by untrusted user input?” A concurrency bug detector needs program analysis in order to tell whether a program can ever deadlock or have races. Static program analysis is practically valuable, but it is also hard. It is telling that the quintessential undecidable computing problem, the “halting problem”, is typically phrased as a program analysis question: “can there be a program that accepts another program as input and determines whether the latter always terminates?” Other program analysis problems have given rise to some of the best known techniques and algorithms in computer science (e.g., data-flow frameworks). This book offers a comprehensive treatment of the principles, concepts, techniques and applications of static program analysis, illustrated and explained. The emphasis is on understanding the tradeoffs of different kinds of static program analysis algorithms and on appreciating the main factors for critically evaluating a static analysis and its suitability for practical tasks.
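A very small instance of the question “where could this program variable have been set?” can be sketched with Python's standard `ast` module. The example below is a purely syntactic, flow-insensitive approximation written for illustration and is not a technique taken from the book: it lists every line where a name is the target of a binding and ignores function parameters, imports, scoping, and aliasing. The helper name `assignment_sites` and the sample program are assumptions.

```python
import ast


def assignment_sites(source, name):
    """Return sorted line numbers where `name` appears as a binding target."""
    tree = ast.parse(source)
    lines = set()
    for node in ast.walk(tree):
        # A Name node in Store context is being written (assignment, loop target, ...).
        if (
            isinstance(node, ast.Name)
            and node.id == name
            and isinstance(node.ctx, ast.Store)
        ):
            lines.add(node.lineno)
    return sorted(lines)


# Hypothetical program to analyze; it is parsed, never executed.
SAMPLE = """\
x = 1
if flag:
    x = 2
y = x + 1
for x in range(3):
    pass
"""

sites = assignment_sites(SAMPLE, "x")
print(sites)
```

Because the program is only parsed, the answer covers all inputs at once, at the cost of precision: the analysis reports every syntactic write, whether or not that line can actually execute.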
User interfaces for our increasingly varied computational devices have long been oriented toward graphical screens and virtual interactors. Since the advent of mass market graphical interfaces in the mid-1980s, most human-computer interaction has been mediated by graphical buttons, sliders, text fields, and their virtual kin.
And yet, humans are profoundly physical creatures. Throughout our history (and prehistory), our bodies have profoundly shaped our activities and engagement with our world, and each other. Despite -- and perhaps also, because of -- the many successes of keyboard, pointer, touch screen, and (increasingly) speech modalities of computational interaction, many have sought alternate prospects for interaction that more deeply respect, engage, and celebrate our embodied physicality.
For several decades, tangible and embodied interaction (TEI) has been the topic of intense technological, scientific, artistic, humanistic, and mass-market research and practice. In this book, we elaborate on many dimensions of this diverse, transdisciplinary, blossoming topic.
It is hard to imagine, but as recently as 1968, computer scientists were uncertain how best to interconnect even two computers. The notion that within a few decades the challenge would be how to interconnect millions of computers around the globe was too farfetched to contemplate. Yet, by 1988, that is precisely what was happening. The products and devices developed in the intervening years—such as modems, multiplexers, local area networks, and routers—became the linchpins of the global digital society. How did such revolutionary innovation occur? This book tells the story of the entrepreneurs who were able to harness and join two factors: the energy of computer science researchers supported by governments and universities, and the tremendous commercial demand for internetworking computers. The centerpiece of this history comes from unpublished interviews from the late 1980s with over 80 computing industry pioneers, including Paul Baran, J.C.R. Licklider, Vint Cerf, Robert Kahn, Larry Roberts, and Robert Metcalfe. These individuals give us unique insights into the creation of multi-billion dollar markets for computer-communications equipment, and they reveal how entrepreneurs struggled with failure, uncertainty, and the limits of knowledge.
The Handbook of Multimodal-Multisensor Interfaces provides the first authoritative resource on what has become the dominant paradigm for new computer interfaces—user input involving new media (speech, multi-touch, hand and body gestures, facial expressions, writing) embedded in multimodal-multisensor interfaces. This edited collection is written by international experts and pioneers in the field. It provides a textbook, reference, and technology roadmap for professionals working in this and related areas. This volume focuses on state-of-the-art multimodal language and dialogue processing, including semantic integration of modalities. The development of increasingly expressive embodied agents and robots has become an active test-bed for coordinating multimodal dialogue input and output, including processing of language and nonverbal communication. In addition, major application areas are featured for commercializing multimodal-multisensor systems, including automotive, robotic, manufacturing, machine translation, banking, communications, and others. These systems rely heavily on software tools, data resources, and international standards to facilitate their development. For insights into the future, emerging multimodal-multisensor technology trends are highlighted for medicine, robotics, interaction with smart spaces, and similar topics. Finally, this volume discusses the societal impact of more widespread adoption of these systems, such as privacy risks and how to mitigate them. The handbook chapters provide a number of walk-through examples of system design and processing, information on practical resources for developing and evaluating new systems, and terminology and tutorial support for mastering this emerging field. 
In the final section of this volume, experts exchange views on a timely and controversial challenge topic, and how they believe multimodal-multisensor interfaces need to be equipped to most effectively advance human performance during the next decade.