Friday, April 3, 2015

Sysadmin Interview Questions (circa 2000)

At one of the not-swift places I worked that had a high turnover rate, we put together the following set of interview questions. The most important and basically operative interview question was the last one in the list.


In order to have a successful interview at our-company, you must be able to answer the following questions correctly:

1. What kinds of files are generally kept in /tmp?
  a) files that could be erased
  b) important records and sensitive market data that need to be saved
  c) binaries
  d) comma files (bonus points for describing what comma files are)
  e) personal mail you'd rather your manager not read

2. What are files ending in the extension .o used for?

3. Scenario:
  Let's say there is a program which hogs resources and occasionally
  doesn't complete. What should be done?

  a) understand why it doesn't complete and possibly fix
  b) run it from crontab every 15 minutes
  c) reboot

4. Scenario:
  Let's say there is a mail reader that corrupts the mailbox when /tmp
  fills up. How would you deal with this situation?

  a) fix the mail reader
  b) use another mail reader
  c) use existing SNMP monitors to check for filesystems which are more than
     95% full
  d) increase /tmp
  e) reboot
  f) remember what you did to fix the corrupted file the last time you had this
     problem so you could quickly do it again

5. What are the important options of the AIX shutdown command?

6. What is the best program editor ever created?
  a) more
  b) cat
  c) vi

7. Essay: discuss the advantages of rebooting hundreds of servers at one time.
  [Hint: Make sure you include ease of remembering and quickness of pain]

8. Are you a light sleeper?

Answers:

1. all except a. See also question 4.
A comma file is the predecessor to the "dot" file, e.g. ,profile ,login ,rhosts. Comma comes before dot in the ASCII collating sequence. As a predecessor, it doesn't have the feature that dot files have of not being expanded by metacharacters.

It is used by applications programmers to make Korn shell code look more sophisticated. For example:

  cat </dev/null>,verbose

2. This is a silly question because at our company, the concept of a filetype or class of file doesn't exist. However, some Systems Administrators use .o for the extension on backup copies or "old" copies of a file. It is handy to use because some programs, such as the C compiler and some Makefiles, will remove them for you automatically.

3. b. And also every 5 minutes

4. d. e & f are only for Senior System Administrators

5. -F 0. [fast, wait 0 seconds] Some applicants will proffer -r (reboot), but that is optional.

6. A trick question: SA's don't program.

7. You can test not only the hardware but also the network as all the
   servers try to make a connection with the single nameserver. It might
   simulate what would happen during a nuclear attack.

8. SA's never sleep!


Executive Email server

The first part of a series on Sysadmin horror stories.

I worked at a large financial organization where the CEO didn't "do" email.

A clever first-line network manager had the idea to get the CEO into the late 20th century. (Yes, it was back then, but still it was at a time when the CEO had no excuse not to use email.)

The CEO had a secretary who handled most of his paper correspondence and meetings and such.

Since there were no official resources for this, the manager dedicated an old Solaris SPARC server to just this guy's email. Given the technology at the time (this was before cloud email providers and virtual machines), this was a perfectly reasonable thing to do. In fact, I thought it clever.

To isolate it from everything else, he gave the server its own DNS domain name inside the company. That too was perfectly reasonable.

The domain name had company-exec in it to try to entice people to use it.

All of this was working fine initially. The CEO didn't get that much email as he didn't use it. Sometimes, his secretary would.

Over time, though, as some of the managers who worked directly underneath him found out about this email address, they would request a boutique email address on that domain as well. And the managers under the top-level manager, who were a little more email savvy, didn't really get that much email either.

At that point, I think the network manager tried to get a bigger disk for the little server, but since the CEO was a tightwad about such things and didn't care about email, he wouldn't approve it.

Now here's where things started to go awry. We now get to the next level of managers, who saw that their managers and the CEO all had this exclusive email domain. They now wanted an email account in that domain and on that server.

People at that level did know how to use email, sort of, and got a lot of it. They were the kind of people who insisted on being on automated email lists. The kind of people for whom 90% of their email is a forward of (detailed, automated, or already forwarded) email that was sent to them. They would respond with the entire text, adding the choice commentary that they got paid so much for:

  underling -
  plz address

Yes, these kinds of managers were essentially overpaid switchboard operators. And were it not for the fact that most of these people were male, I wouldn't be surprised if they were in fact laid-off switchboard operators of an earlier era.

Or rather switchboard multiplexers, because the email would be cc'd to the 10 other people on the email list, some of whom were also using the executive email server.

It was becoming clear that the little server just didn't have the disk space to handle this. Again, attempts to get a beefier server were to no avail.
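Watching for this is not hard, which is part of what makes the story painful. Here is a minimal sketch in Python of a "nearly full" check, borrowing the 95% threshold from the quiz above. The function names are made up for illustration, and the number it computes won't exactly match what df reports on filesystems with reserved blocks.

```python
import shutil


def percent_used(total, used):
    """Return used space as a percentage of total (0.0 for an empty total)."""
    return 100.0 * used / total if total else 0.0


def is_nearly_full(path, threshold=95.0):
    """True if the filesystem holding `path` is more than `threshold` percent full."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return percent_used(usage.total, usage.used) > threshold


if __name__ == "__main__":
    # Hypothetical mount point; substitute wherever the mail spool lives.
    for mount in ["/tmp"]:
        state = "NEARLY FULL" if is_nearly_full(mount) else "ok"
        print(f"{mount}: {state}")
```

Of course, a check like this only helps if somebody is allowed to buy the disk.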

The shit hit the fan when this got coupled with another quirk.

In general, application software and financial trading software weren't written all that well. The company relied on the steely nerves and quick responses of systems administrators to correct for application errors. (More on that in another blog.)

One of the flaky programs for one of the real-time trading systems had a custom program written for it that checked its health every second (or less) because, well, this was an important real-time program. And what it did when the program stopped working was spam the email system with alert emails.
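For contrast, here is a minimal sketch in Python of what a less spammy checker could look like: it still checks as often as it likes, but sends at most one alert per interval and counts the rest. This is not the program we had; the class and function names, the five-minute interval, and the message text are all assumptions for illustration.

```python
import time


class AlertThrottle:
    """Allow at most one alert per `min_interval` seconds; count the rest."""

    def __init__(self, min_interval=300.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the logic can be tested
        self.last_sent = None
        self.suppressed = 0

    def should_send(self):
        now = self.clock()
        if self.last_sent is None or now - self.last_sent >= self.min_interval:
            self.last_sent = now
            return True
        self.suppressed += 1
        return False


def run_checks(check_results, throttle, send):
    """Feed health-check results (True = healthy) through the throttle."""
    for healthy in check_results:
        if healthy:
            continue
        if throttle.should_send():
            send(f"trading app unhealthy "
                 f"({throttle.suppressed} earlier alerts suppressed)")
            throttle.suppressed = 0
```

With a check every second and a 300-second interval, ten minutes of failure produces two emails instead of six hundred.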

So shortly after that next level of managers was added to the "executive" email system, there was a problem with the crappy real-time trading application. The server was flooded with gigabytes of email alerts to all of the executives, along with a smattering of their cc's to other people, some also on the executive email server, telling them to fix it. The end result was that during this crisis, all of them were locked out of sending and receiving email.

We took advantage of the outage caused by that server to swap it out for a beefier box to which we could add a larger disk later. As for approving the larger disk, the secretary slipped an invoice of about $200 for it in between two of the CEO's travel vouchers, each of which was an order of magnitude larger.

Aside from the other features of this story, the aspect that haunts me is how a series of reasonable local decisions, each individually logical or maybe even clever, led to a total disaster. I sometimes think writing software is like that too.

The "Test Driven Development" philosophy addresses this by embracing "refactoring" code. That is, part of the process is a step where you reassess and rework before going further.