Subject: Long jobs history #5
Date: Thu, 08 May 1997 17:46:35 EDT
From: "Naomi B. Schmidt" <nschmidt@MIT.EDU>


------- Forwarded Message

Received: from SOUTH-STATION-ANNEX.MIT.EDU by po6.MIT.EDU (5.61/4.7) id AA03825; Tue, 8 Aug 95 12:03:46 EDT
Received: from MOZART.MIT.EDU by MIT.EDU with SMTP
	id AA10266; Tue, 8 Aug 95 12:03:34 EDT
Received: by mozart.MIT.EDU (5.57/4.7) id AA26880; Tue, 8 Aug 95 12:03:34 -0400
Message-Id: <9508081603.AA26880@mozart.MIT.EDU>
To: patriot2@MIT.EDU
Cc: gjackson@MIT.EDU, cec@MIT.EDU, rar@MIT.EDU
Subject: Notes from the 8/7/95 Long Jobs meeting
Date: Tue, 08 Aug 1995 12:03:34 EDT
From: "Naomi B. Schmidt" <nschmidt@MIT.EDU>


Present were Matt, Carla, Tom, Steve, Miki, and Naomi.

Miki reported on her digging into the historical lore of Patriot1.

	- We tried to make the VAX9000 into as much of an Athena workstation 
	as possible.

	- The way it worked was that the command 'patriot' on any Athena
	workstation logged you on to the machine.  Once on the VAX9000, 
	you could use the command 'submit' to submit your job.  

	- There were many bugs in submit, which we fixed.

	- The hacked kernel caused Patriot to frequently crash.

	- People had lots of complaints about their 'nice' levels.

	- Depending on the load on the machine at any particular time, a
	compute-intensive job might take different amounts of time each time
	it was run.

	- People were running all sorts of jobs on it, from IRC to mail, to
	whatever John Carr was doing.  There was no tracking or record
	keeping.

	- We gave people 10-hour tickets by default.

Our next steps are as follows (to be done by August 28, at which time we will
meet again):

	- We will recreate patriot on two low-end SPARCs (Classics or
	SPARC5's)  (Matt will take over from Miki with Yoav's help)

	- We will port dialup to Solaris so as to plug security holes.  This
	includes a sane version of sendmail so that output can be sent to 
	users (Matt)

	- Concurrently, we will do some benchmarking on a SPARC20 with two
	processors to find out what the throughput looks like (Steve Ellis)

	- We will contact Dennis Aylward to get a quote on a SPARC20 with 4
	processors, equipped as in previous notes (Naomi)

	- We will identify a pilot audience for the Fall and solidify our
	policies on rules of use: who is allowed to use it, how people
	register to use it, etc. (Naomi)

Other discussion of details:

	- We will give 23-hour tickets by default.

	- We will give Matlab the IP address of the machine to allow it to
	continue running without renewal of tickets.

	- We will keep logs of processes that are run on the machine and how
	many CPU minutes/hours each takes.

	- The scheduler will 'renice' jobs to some extent to do some queue
	management.

Other ideas:

	- Rather than allowing users to log into patriot, we might write a
	script to do rkinit, so that they just submit jobs to it - that 
	way they can't just log in to read mail when the dialups are busy.

	- The service would be available to two classes of users:

		- people in classes that register to use the machine -
		instructors will register with us and will maintain the 
		lists of individual usernames 

		- people who need it to do Thesis or UROP work.  They will
		fill out an electronic form stating what they need it for and
		we will add them to the password file.  The form will state
		explicitly what the machine may and may not be used for - 
		violation of this will allow us to remove an individual's
		access.
		
		[Idea - modify the 'register' program so that they can do
		this and be added to the password file automatically]

Criteria for Success:
--------------------

1.  We will gather stronger data on the magnitude and nature of the use of
such a facility.

2.  We will be able to decide whether a service such as this belongs in the
environment, possibly with a modified technology such as a commercial batch
software package.

3.  We will know whether there is a reasonable way that we can satisfy the
magnitude of the demand.

4.  We will have a better definition of the support requirements for such a
system.

5.  People will leave fewer workstations logged in and tied up with no one
present.

6.  The faculty will say that we are addressing their instructional needs.





------- End of Forwarded Message

