With Microscope and Tweezers:

An Analysis of the Internet Virus of November 1988

Introduction

Organization
A Rose by Any Other Name
Goals and Targets
Major Points
- How it entered
- Who it attacked
- What it attacked
- What it did NOT do

The Internet [internet][notablenets], a collection of interconnected networks linking approximately 60,000 computers, was attacked by a virus program on 2 November 1988. The Internet community comprises academic, corporate, and government research users, all seeking to exchange information to enhance their research efforts.

The virus broke into Berkeley Software Distribution (BSD) UNIX and derivative systems. Once resident in a computer, it attempted to break into other machines on the network. This paper is an analysis of that virus program and of the reaction of the Internet community to the attack.

Footnotes:

UNIX is a trademark of AT&T. DEC, VAX, and Ultrix are trademarks of Digital Equipment Corporation. Sun, SunOS, and NFS are trademarks of Sun Microsystems, Inc. IBM is a trademark of International Business Machines, Inc.

Organization

In Section [intro] we discuss the categorization of the program which attacked the Internet, the goals of the teams working on isolating the virus and the methods they employed, and summarize what the virus did and did not actually do. In Section [strat] we discuss in more detail the strategies it employed, the specific attacks it used, and the effective and ineffective defenses proposed by the community.

Section [chron] is a detailed presentation of the chronology of the virus. It describes how our group at MIT found out about and reacted to the crisis, and relates the experiences and actions of select other groups throughout the country, especially as they interacted with our group.

Once the crisis had passed, the Internet community had time not only to explore the vulnerabilities which had allowed the attack to succeed, but also to consider how future attacks could be prevented. Section [lessons] presents our views on the lessons learned and problems to be faced in the future. In Section [acks] we acknowledge the people on our team and the people at other sites who aided us in the effort to understand the virus.

We present a subroutine-by-subroutine description of the virus program itself in Appendix [progappendix], including a diagram of the information flow through the routines which comprise the ``cracking engine''. Appendix [dict] contains a list of the words included in the built-in dictionary carried by the virus.

Finally, in Appendix [cast] we provide an alphabetized list of all the people mentioned in this paper, their affiliations, and their network mail addresses.

A Rose by Any Other Name

The question of how to classify the program which infected the Internet has received a fair amount of attention. Was it a ``virus'' or a ``worm'', or was it something else?

There is confusion about the term ``virus.'' To a biologist a virus is an agent of infection which can only grow and reproduce within a host cell. A lytic virus enters a cell and uses the cell's own metabolic machinery to replicate. The newly created viruses (more appropriately called ``virions'') break out of the infected cell, destroying it, and then seek out new cells to infect. A lysogenic virus, on the other hand, alters the genetic material of its host cells. When the host cell reproduces, it unwittingly reproduces the viral genes. At some point in the future, the viral genes are activated and many virions are produced by the cell. These proceed to break out of the cell and seek out other cells to infect [biovirus2]. Some single-stranded DNA viruses do not kill the host cell; they use the machinery of the host cell to reproduce (perhaps slowing normal cellular growth by diverting resources) and exit the cells in a non-destructive manner [biossdna].

A ``worm'' is an organism with an elongated segmented body. Because of the shape of their bodies, worms can snake around obstacles and work their way into unexpected places. Some worms, for example the tapeworm, are parasites. They live inside a host organism, feeding directly on nutrients intended for host cells. These worms reproduce by shedding one of their segments, which contains many eggs. They have difficulty in reaching new hosts, since they usually leave an infected host through its excretory system and may not readily come into contact with another host [bioworm].

In deciding which term fits the program which infected the Internet, we must decide which part of the system is analogous to the ``host''. Possibilities include the network, host computers, programs, and processes. We must also consider the actions of the program and its structure.

Viewing the network layer as the ``host'' is not fruitful; the network was not attacked, specific hosts on the network were. The infection never spread beyond the Internet even though there were gateways to other types of networks. One could view the infection as a worm, which ``wiggled'' throughout the network. But as Beckman points out [ncsc], the program didn't have connected ``segments'' in any sense. Thus it can't be a worm.

A model showing the computers as the ``host'' is more promising. The infection of 2 November entered the hosts, reproduced, and exited in search of new hosts to infect. Some people might argue that since the host was not destroyed in this process, the infecting program was more like a worm than a virus. But, as mentioned earlier, not all viruses destroy their host cells. Denning [denning] defines a computer worm as a program which enters a workstation and disables it. In that sense the infection could be considered a worm, but we reject this definition. The infected computers were affected but not all were ``disabled''. There is also no analog to the segments of a biological worm.

Denning has described how many personal computer programs have been infected by viral programs [denning]. These are frequently analogous to lysogenic viruses because they modify the actual program code as stored in the computer's secondary storage. As the infected programs are copied from computer to computer through normal software distribution, the viral code is also copied. At some point the viral code may activate and perform some action such as deleting files or displaying a message. Applying this definition of a virus while viewing programs as ``hosts'' does not work for the Internet infection, since the virus neither attacked nor modified programs in any way.

If, however, processes are viewed as ``hosts'', then the Internet infection can clearly be considered a viral infection. The virus entered hosts through a daemon process, tricking that process into creating a viral process, which would then attempt to reproduce. In only one case, the finger attack, was the daemon process actually changed; but as we noted above, only lysogenic viruses actually change their host's genetic material.

Denning defines a bacterium as a program which replicates itself and feeds off the host's computational resources. While this seems to describe the program which infected the Internet, it is an awkward and vague description which doesn't seem to convey the nature of the infection at all.

Thus we have chosen to call the program which infected the Internet a virus. We feel it is accurate and descriptive.

Goals and Targets

The program that attacked many Internet hosts was itself attacked by teams of programmers around the country. The goal of these teams was to find out all the inner workings of the virus. This included not just understanding how to stop further attacks, but also understanding whether any permanent damage had been done, including destruction or alteration of data during the actual infection, or possible ``time bombs'' left for later execution.

There were several steps in achieving these goals, including:

isolating a specimen of the virus in a form which could be analyzed;
``decompiling'' the virus into a form that could be shown to reduce to the executable of the real thing, so that the higher level version could be interpreted;
analyzing the strategies used by the virus, and the elements of its design, in order to find weaknesses and methods of defeating it.

The first two steps were completed by the morning of 4 November 1988. Enough of the third was complete to determine that the virus was harmless, but there were no clues to the higher level issues, such as the reason for the virus' rapid spread.

Once the decompiled code existed, and the threat of the virus was known to be minimal, it was clear to the MIT team and those at Berkeley that the code should be protected. We understood that the knowledge required to write such a program could not be kept secret, but felt that if the code were publicly available, someone could too easily modify it and release a damaging mutated strain. If this occurred before many hosts had removed the bugs which allowed the penetration in the first place, much damage would be done.

There was also a clear need to explain to the community what the virus was and how it worked. This information, in the form of this report, can actually be more useful to interested people than the source code could be, since it includes discussion of the side effects and results of the code, as well as flaws in it, rather than merely listing the code line by line. Conversely, there are people interested in the intricate detail of how and why certain routines were used; there should be enough detail here to satisfy them as well. Readers will also find the papers by Seeley [seely] and Spafford [spafpaper] interesting.

Major Points

This section provides an outline of how the virus attacked and whom it attacked. It also lists several things the virus did not do, but which many people seem to have attributed to it. All of the following points are described in more detail in Section [strat].

How it entered

sendmail (needed debug mode, as in SunOS binary releases)
finger[finger] (only VAX hosts were victims)
remote execution system, using
- rexec
- rsh

Who it attacked

accounts with obvious passwords, such as
- none at all
- the user name
- the user name appended to itself
- the ``nickname''
- the last name
- the last name spelled backwards
accounts with passwords in a 432-word dictionary (see Appendix [dict])
accounts with passwords in /usr/dict/words
accounts which trusted other machines via the .rhosts mechanism
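
The ``obvious password'' categories above lend themselves to a simple audit check that a site administrator could run against its own accounts. The following is a minimal sketch in Python, not the virus's own code. The function names, the choice of the first word of the full-name field as the ``nickname'', and the case-insensitive comparison are all assumptions made for illustration.

```python
def obvious_password_candidates(username, full_name):
    """Return the 'obvious' guesses for one account, in list order, without duplicates.

    Assumption: the "nickname" is taken to be the first word of the full-name
    field and the last name its final word; the real program may differ.
    """
    parts = full_name.split()
    nickname = parts[0] if parts else ""
    last_name = parts[-1] if parts else ""
    raw = [
        "",                   # no password at all
        username,             # the user name
        username + username,  # the user name appended to itself
        nickname,             # the "nickname" (assumed: first word of full name)
        last_name,            # the last name
        last_name[::-1],      # the last name spelled backwards
    ]
    seen = set()
    result = []
    for guess in raw:
        if guess not in seen:
            seen.add(guess)
            result.append(guess)
    return result


def is_obvious_password(password, username, full_name, dictionary=()):
    """Report whether a password falls in one of the weak categories.

    `dictionary` stands in for a word list such as the built-in dictionary or
    /usr/dict/words. Comparison ignores case, which is an assumption.
    """
    lowered = password.lower()
    candidates = {c.lower() for c in obvious_password_candidates(username, full_name)}
    words = {w.lower() for w in dictionary}
    return lowered in candidates or lowered in words
```

For a hypothetical account `jdoe` with full name `John Doe`, the candidate list would contain the empty password, `jdoe`, `jdoejdoe`, `John`, `Doe`, and `eoD`. Any password that such a check flags should be changed.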

What it attacked

SUNs and VAXes only
machines in /etc/hosts.equiv
machines in /.rhosts
machines in cracked accounts' .forward files
machines in cracked accounts' .rhosts files
machines listed as network gateways in routing tables
machines at the far end of point-to-point interfaces
possibly machines at randomly guessed addresses on networks of first hop gateways
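
Several of the targets above come from trust files such as /etc/hosts.equiv and .rhosts, so it can be useful to see which machines a given file exposes. The sketch below is an illustrative audit helper in Python, not a reconstruction of the virus. The function name is hypothetical, and the file syntax is simplified: it assumes one entry per line with the host name first, `#` comments, a lone `+` meaning that every host is trusted, and it ignores netgroup and negative entries.

```python
def trusted_hosts(trust_file_text):
    """Return (hosts, trusts_any_host) for the text of a hosts.equiv/.rhosts-style file.

    Simplifying assumptions: the host name is the first field on each line,
    '#' starts a comment, a lone '+' trusts every host, and entries that
    begin with '+' or '-' (netgroups, negative entries) are skipped.
    """
    hosts = set()
    trusts_any_host = False
    for raw_line in trust_file_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        host = line.split()[0]
        if host == "+":
            trusts_any_host = True
        elif host[0] in "+-":
            continue  # netgroup or negative entry: ignored in this sketch
        else:
            hosts.add(host.lower())
    return hosts, trusts_any_host
```

Run over the contents of a site's own trust files, a helper like this lists the machines from which a login would be accepted without a password, which are the same machines an intruder on this host would try next.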

What it did NOT do

gain privileged access (it almost never broke in as root)
destroy or attempt to destroy any data
leave time bombs behind
differentiate among networks (such as MILNET, ARPANET)
use UUCP at all
attack specific well-known or privileged accounts such as root