Monday, October 20, 2025

Me and AP

 

There are several incidents and stories I tell folks, and as I get more "over the hill and gaining speed," I think I should write these down.

This one is about my time working at the Associated Press (AP). This was my job after leaving PaineWebber.

I alluded to working at PaineWebber in Executive Email Server.

When I worked there in the 1990s, PaineWebber had just bought out Kidder, Peabody & Co. But then in 2000, it was bought out, or rather "merged," into UBS.

For a while, the merged company was known as UBS PaineWebber. But in 2003, misters Paine and Webber dropped out of the picture, and that 123-year-old name was retired.

PaineWebber treated people like pencils: use them until they wear out, then throw them out and get another. It was clear that this was happening to my manager's manager. This happened just before PaineWebber itself was about to be merged.

Through a younger and wiser colleague at PaineWebber, I got a job at the Associated Press at Rockefeller Center. 50 Rockefeller Plaza, next to "30 Rock" of TV fame, is called the Associated Press Building.

When I entered to go to my office every morning, there was an entire bullpen of reporters busily clicking away at the upgraded version of the typewriter: the keyboard of a desktop computer.

Although AP had a full building in Rockefeller Center, by that time AP had completely occupied it. Since the group I worked in was a new business for AP (online news), we were in the International Building on 5th Avenue. This building is known for its statue of Atlas carrying the cosmos on his back:



In a short while, I became the lead of the systems administration and databases group (of about 3 people). I occupied the corner office on the 6th floor, only partially visible in the lower left-hand corner of the photo above. When I turned my head to the right, I could commiserate with Atlas.

 But this corner office had two windows! Here is a photo that shows that better:


If you go to the back of the statue, you will find a small drain hole for rain runoff that collects along the statue. At the right time and from the right view, it looks like Atlas has peed there. A second fun fact is that there is a terrace on top of the entranceway, on the roof of our office, accessible from the 6th floor through a simple door.

On one nice summer day, someone had a barbecue up there, and the smoke billowed down into the statue area where people were congregating. It must have been kind of weird for the tourists there.

As I said, in front of me was Saint Patrick's Cathedral:


Notice that the windows open manually! Every year during the Saint Patrick's Day parade, AP photographers would open the window and shoot the parade from that spot.

The last time I visited that spot, it was a fitness gym!

The other interesting thing about AP was its satellite dish. It was nestled into the building housing AP's computer data center, at exit 8A (Cranbury) off the NJ Turnpike (I-95).

Taking a virtual drive using Google Street View, I see the building, but it no longer belongs to AP.

This was the largest satellite dish I've ever seen. From the roadway you wouldn't notice anything, but if you flew above, you'd see this two-story satellite dish. I was told that they wanted it close to NYC, but not too close, and that in times of emergency the U.S. Government could take over the satellite feed.




Friday, October 18, 2024

A little about Woodwind Timbre and Oboe Timbre specifically

I've always been fascinated by timbre: the quality of a sound, or tone, that distinguishes it from other sounds of the same pitch and volume.

I have (or had) perfect pitch, the ability to identify or reproduce a musical note without a reference point.

I've written quite a bit about this. Here, what I want to say (or repeat) is that I came to this through a fascination with the timbre of a particular note: the lowest note on a cheap Gemeinhardt flute. The lowest note on a woodwind instrument is the funkiest, and while the timbre variation on the flute is not as pronounced as on other instruments like the oboe (which this post will largely be about), there is still some variation.

Woodwind instruments, as a class, are unusual in this respect: the variation in their timbre across individual notes, from one instrument to another, and between different instruments of the family is great. Much greater than for, say, string instruments, a piano, or electronically synthesized instruments, even electronically synthesized woodwind instruments.

There is a reason for this, but I won't go into the technical reasons here.

But to get a sense of this inhomogeneity between different woodwind instruments, consider the opening of a woodwind quintet by Jacques Ibert, Trois Pièces Brèves:


Notice how the instruments don't blend together. In this clip, to me, it feels like there are at least 3 separate color palettes for the 3 sections of this little clip. In the opening, there is the sound of all of the instruments together. Then there is a section with the horn solo, which has the lower instruments. In the last section of this particular clip, there is the oboe solo with its predominantly bassoon accompaniment, which gives it a more reedy palette.

Note: not all of the instruments in a woodwind quintet are strictly woodwinds; there is one brass instrument, the French Horn.

To give a more extreme sense of not blending, here is an example that I like. It adds more brass and some percussion too:


My heartbeat and blood pressure rise every time I hear the above clip.

Now compare this to the string family in the opening passage from Elgar's Enigma Variations, Nimrod Variation:




These examples were selected and contrived to show my point in the extreme. In particular, the Elgar violin, viola, and cello parts are divided so that the first viola parts match the lower violin parts, the lower viola parts match the upper cello part, and the lower cello part matches the bass. So there is deliberate extra blending going on here.

But the point still remains. To me, this is like the differences in paintings I mention in Key and Psychology.

Ok. So let me now tie this non-homogeneity into the oboe in particular. In a later post, I may show how the oboe also blends and how that had historical importance.

Above I gave an example of non-homogeneity between different members of the woodwind family: flute, oboe, clarinet, bassoon.

For the oboe, the timbres vary greatly from instrument to instrument as well, and that is a whole topic on its own.

However, within a single instrument played by a single person, there is variation between adjacent notes in a scale across the three or so octaves that the oboe spans.

To ease into demonstrating this, consider this pretty well-known example from Prokofiev's Peter and the Wolf:








This passage is in the lower part of the oboe's range. As I mentioned briefly for the flute, the lowest notes of a woodwind instrument are the "rawest": they blend the least and are richest in overtones.

I suspect Prokofiev picked this low region for its honkiness and duck-like quality. As a friend sarcastically remarked: The versatility of the oboe (which to him always sounds like a duck) and the genius of Prokofiev combine to produce startling barn-yard effects!

Timbre changes in wind instruments are related in a way similar to the elements of the chemical periodic table; think of each octave of a flute or oboe as a row in the periodic table. It has a row-like character or "heaviness". But pitches within an octave (like periodic-table columns) have a timbre similarity as well. (For the clarinet, the term "octave" is changed to "register", because the relation at the lowest level is an octave and a fifth.)

Here is a guide to the oboe timbres heard in this passage. The non-natural notes Eb and Db are more muted than their neighboring notes E natural and D. C, since it was the original lowest note of the instrument, is also very honky, like a duck. In the last two notes of measure 4, where we jump up to D and B, some of the rawness goes away, but you can hear traces of it.

If this is too involved to follow, just focus on the first four distinct pitches, E, Eb, D, and Db, and see if you notice how the timbre quality changes: in particular, open E, muted Eb, open D, muted Db.

The above was about the lower range of the oboe near Middle C. As for timbres in the last part of the solo: 



The sound of the upper C above middle C has less of that lower rawness and is more pure. The neighboring Eb, while having the second-octave characteristic, still has some of the timbre of its lower Eb counterpart, just less pronounced.

The Eb is a little more muted than, say, its surrounding notes E or D. Similarly, the Db which appears in the second measure is muted too (a little bit more so than the Eb). The Bb, while not as muted as the lower Eb, is still muted compared to C. F is a little muted compared to the G that follows. And notice the change in timbre between the C above middle C and the Eb below it.

Here is the same passage heard an octave higher, as it appears later in the piece. Notice how the register timbre is different and a lot of the coarseness has been removed.



Towards the end of the passage, the oboe goes into the lower part of the third octave above middle C and the upper part of the second octave above middle C. Here the coarseness is removed even more. (But again, I can hear the timbre changes between individual notes.)

I realize all of this may be hard to hear, and may seem like a wine connoisseur describing different kinds of wine profiles.

When I hear an orchestra, as a result of these slight timbre changes, I can pick out the specific notes that are being played, and thus calibrate my perfect pitch sense. I can do the same on hearing a flute as well. I should say that since I played both of these instruments in the past, these kinds of timbre changes have been ingrained.




Wednesday, September 18, 2024

Oboe Reed Engineering


On a Tuesday night hike, I started ranting about how the oboe is the most engineering-like of instruments in the construction of its reed, and how the reed can drastically change the timbre.

In this part, I'll mention a little bit about oboe reed construction. The timbre aspect is equally fascinating but that will be done separately.

Here is how an oboe reed is scraped in the style developed by Marcel Tabuteau and John de Lancie around 1948-1951.

Here is a back-lit image of this kind of reed. Oboists who use this style of scrape do hold the reed up to the light like this to see what's up.


In Tabuteau's reeds, the spine down the middle did not exist. However, it is now generally added since it makes the reed more sturdy.

This style is now popular; it has changed the oboe sound in the U.S. This style and sound then spread to Australia, Japan and Asia, Latin America, England (after a longer delay), and many other countries. France and Germany are more steeped in their own traditions, styles, and sounds, so the spread there has been less than elsewhere in Europe.


Here is an image of the "French" style which was almost universally used before 1948:



The image is from Regency Reeds, a seller of oboe reeds from England. There generally is no backlit view of these kinds of reeds because there is no need for one. However, here is a diagram comparing the two:



from https://oboewan.com/index.php/about-oboe-reeds/

Now here is a back-lit image of a slightly larger oboe used mostly in the Baroque period, called the Oboe d'amore: 


It uses the American scrape, so it is probably used on a modern instrument. The Oboe d'amore is slightly larger and is in A rather than C; it is a transposing instrument. The current modern oboe's lowest note goes down to Bb, so you might wonder what's the big deal. But in the Baroque period, oboes went down only to middle C. And since they had only three keys (where one key was a symmetric duplicate of another, so one could play the instrument with either the left hand or the right hand in the upper position), playing in a key like A major (and closely related keys) sounded very different on an oboe in C than on an oboe tuned in A.


Now compare this reed with the reed for the smaller Baroque oboe of the kind used back in the 18th century, when the oboe was first developed from an older instrument called the "shawm":



The Baroque oboe is made of a softer wood - boxwood instead of the harder grenadilla wood or plastic. And the bore of a Baroque oboe is larger than that of its modern counterpart, so there is less air pressure on the player and more airflow passes through the instrument.


Notice that, as is the custom with the Oboe d'amore, English Horn (in F), or Bassoon, there is no cork on the metal part or "staple". This reed is a little bit larger and wider than an oboe reed. So it is more forgiving than a modern oboe d'amore reed, which in turn is more forgiving than a modern oboe reed.

You'll see that the Baroque version in C is about the same size or larger. It is wider, or more V-shaped, at the tip that goes into the mouth. Also, there is no cork on the staple.

Compare the oboe reed with a bassoon reed:




See how the bassoon reed is much wider, and the cane portion down to the string is about twice as long. The bassoon plays about an octave lower than the oboe.

This means that tolerances are larger and can be more forgiving, and the reed is influenced less by lip mass. The scrape is in the French style. Typically there is a wire that can control the aperture of the reed opening as well as help keep air from coming out the sides. Since there is less pressure on the bassoon, such leaks are less likely there. One of the French-scrape oboe reeds above also has a wire. On the oboe, wires are typically frowned upon because, at the oboe's scale, a wire can interfere with the tone; more so for American scrapes than for the French scrape, where there is more cane bark near the wire. But for a commercial maker such as this English firm, it is more practical to ensure there is no air leaking from the sides.

About the cork on the staple: in modern oboes, there is a cylindrical reed well.


from Yamaha's The Oboe: double reed mechanism

In the modern oboe, not inserting the reed's cork completely to the bottom of the well causes a break in the conical bore's wind flow at a place of higher pressure, where it is more critical to sound production. So, in contrast to most other instruments, you can't or don't "tune" the modern instrument, other than by "lipping" the note up or down. (Lipping down is easier than up.) The oboe's inflexibility in tuning is one of the reasons the orchestra tunes to the oboe. (The oboist tunes the instrument by adjusting the reed's length and scrape so the pitch comes out correct.)

Other instruments of the orchestra are inflexible too, like keyboard instruments and tuned percussion (xylophone and chimes). When a keyboard instrument is the soloist, as in a concerto, the orchestra tunes to that instrument. But since the xylophone and chimes are rare, minor instruments, they are not used for tuning even though they are inflexible as well. (The oboe's clear, or perhaps "piercing," sound also makes it good to tune to.)


Thursday, June 20, 2024

Python in 2016 feels like C of 1996

(This originally appeared on Quora)

In 2016 I use Python a lot. I've been using it for over 10 years or so. But even with that much use it still feels clunky. It reminds me of my experiences with C, which go back even longer.

Let me explain.

There are people who feel that Python has revolutionized the ease with which they work. xkcd had a comic about it: xkcd: Python. I guess the pythonistas liked it so much that since Python 2.7 (and PyPy 2.6) it has been in the standard library distribution: you can get to it via the antigravity module, which was added after the comic came out.

And yes, many find Python an improvement over earlier languages, much the same as was once felt about C. I can see how they feel that way. There is a great body of libraries in the language, as with C. Python has about everything.

Except elegance, and orthogonality.

But before diving into that, let me come back to C circa 1996. As I said, that's exactly how people felt, and I suppose still feel, about C. For example, what is the most-used version of Python written in? C, of course. In fact, that version of Python is often called CPython.

Back in 1996, the reason people were raving about C, I guess rightfully, was that it was the Fortran of systems programming. So I guess in a sense C around 1989 or so was the Fortran of 1959. And I'll go into the Fortran comparison just a little bit as well.

When I first started working at IBM Research, I worked with a guy, Dick Goldberg, who worked on the original Fortran project. Back then I felt Fortran was a little clunky too: there was no BNF language definition. (But note that the B in BNF stands for Backus, the same guy who was behind Fortran. Fortran came before BNF, and this aspect was later addressed in languages generally, if not in Fortran itself.)

And it had these weird rules, like you couldn't index by arbitrary expressions like x[i*j], but you could do some limited forms of that, like x[2*i] (or is it x[i*2]?). The reason was that back then they translated this kind of thing into IBM assembler, which allowed a memory index (a multiplication) along with a memory offset adjustment (an add) as a single instruction, and the Fortran compiler could turn some of these expressions into that form, although it couldn't even re-associate 2*i to i*2 if that was needed.

But the big deal, as Dick explained to me, is that scientific programmers and numerical analysts didn't have to learn assembly language, because Fortran did a good enough job, on average, to obviate that need for the most part.

And that brings us back to C. Fortran is not suitable for writing operating systems or systems code, and so around 1990 many operating systems were written in assembly language. Microsoft's OSes were. IBM's were as well.

But now for Unix, then minix and then Linux, you could write in the higher level language, the same as you could in Fortran for numerical or scientific software.

And libraries were written, and to this day the bulk of complex libraries are still written in C: regular expressions, XML parsing, database systems like MySQL and Postgres, and a lot of systems code. Even the guts of numpy, the numerical package for Python, have a lot of C code at the core, and that is very common. You'll find that other popular programming languages like Ruby or Perl are written in C, and their numerical packages fall back to C as well, as do the bindings that interface to those C systems.

Inelegance

Ok. So now let’s get back to inelegance in Python.

In Python, some functions are top-level functions, like len(), hasattr(), type(), getattr(), str(), repr() and so on, and some functions are methods on an object instance, like the .join() or .append() methods of string and list objects, even though lists and strings are built-in types. So instead of writing uniformly o.join('a').split('x').len().str() or str(len(split('x', join(o)))), you have to go back and forth depending on which arbitrary way it was decided for each function, e.g. str(len('o'.join('a').split('x'))).
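To make the mix concrete, here is a small sketch (the variable names are my own) of how a pipeline has to alternate between method calls and top-level functions:

```python
# Some operations are methods, others are top-level built-in functions,
# so a chain can't read uniformly left to right.
words = ["ab", "cd", "ef"]

joined = "x".join(words)    # join is a method on the separator string...
n = len(joined)             # ...but length is a top-level function,
text = str(n)               # and so is string conversion.

# The whole pipeline mixes both styles:
result = str(len("x".join(words).split("x")))
print(joined, n, text, result)
```
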

This is like learning, in arithmetic expressions, which operators are infix, postfix, or prefix, and what the precedence is when there aren't parentheses. And yes, Python follows in that tradition, like many programming languages, including C nowadays as well. This is in contrast to the Polish prefix notation of Lisp, or the old HP calculators, Forth, or Adobe's PostScript.

I'm not suggesting languages change it. But I am saying it is more cumbersome. And when you extend that notion, it hurts my little brain.

C was notorious for extending that into "address of" (&) and "indirection" (*), pre- and post-increment (e.g. ++), bitwise operators (|), logical operators (&&), indexing ([]), and selection (., ->). Keeping the precedence in mind was so difficult that most people just used parentheses even when not strictly needed.

I do need to back off the comment about top-level versus method functions in Python. But only a little...

"len" for example is a method off of some type like string or list. But it has these ugly double underscores around it, I think that was added to deter people from writing it in the more natural or uniform way like you would in say javascript.

Python does stuff like this: take something that would be usable, but muck it up to make it less usable. And there is a community of Python Nazis out there who will scream at you if you use x.__len__(), however ugly it is, instead of len(x).
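For what it's worth, the two spellings really are the same operation underneath; a quick sketch (the Box class is my own invention):

```python
# len(x) delegates to the type's __len__ method, so the two calls below
# do the same thing; the double underscores are what make the direct
# spelling ugly.
items = [1, 2, 3]
assert len(items) == items.__len__() == 3

# A user-defined type opts in to len() by defining __len__:
class Box:
    def __init__(self, *contents):
        self.contents = contents
    def __len__(self):
        return len(self.contents)

print(len(Box("a", "b")))  # prints 2
```
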

These are the same kinds of people who insist that you should write:

import os

import sys

rather than the shorter:

import os, sys

When you ask why, they say: just because. Well, actually they don't say it that way; they refer you to an acronym and number, PEP8, which is taken as law. Everyone has to do it that way. Fascism.

The fascists say that doing things this way makes code easier to understand if everyone does things the same way. Really? Mankind has long been able to deal with variation in expression without any effort whatsoever. I can write "It is one thing" or "It is a thing" and most people don't obsess over whether it has to be "one" or "a". Actually, in English there is a certain style difference: if I want to emphasize the singleness, I might choose "one" over "a".

And so I'd argue that this kind of nice subtlety is missing among the Python fascists. For my part, I'd prefer to stipulate that every program imports "os" and "sys", be done with that, and get on with the rest of my life and the more important task of programming. The next best thing is just to string along that preamble crap in one line, the way Jewish blessings always start "Baruch atoy adenoy eluhanu" ("Blessed is God, King of the universe").

Oh, and by the way, the "import" statements have to be at the top (except that sometimes there are situations where they won't work if they are at the top). That seems so 1980s-Pascal-like to me; Pascal required all constants, types, and variables to be put at the top of the program. But in Pascal it was not the same kind of fascism; the Pascal grammar was just limited that way.

And while on the topic of Pascal, let me mention another related aspect. Originally, and to the end of Python 2, "print" was a reserved word, very similar to Pascal's "writeln". And because it was a reserved word, you didn't put parentheses around the arguments to print. I'm sure this was thought clever in the same way it was thought clever in Pascal. When Python 3 came about, that changed, so that print is now a regular, uniform function call. Oh, but by then a decade-or-so-old code base had to change.
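To illustrate the change: in Python 2, print "hello" was a statement with its own syntax; in Python 3 it is an ordinary function, which also means it takes keyword arguments and can be treated as a value. A small Python 3 sketch:

```python
# Python 2:  print "hello", "world"      (a statement, no parentheses)
# Python 3:  an ordinary function call, with keyword arguments:
print("hello", "world", sep=", ", end="!\n")

# Being a plain function, print can now be aliased or passed around:
emit = print
emit("print is just a name bound to a callable")
```
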

Now you might think, "who knew?" Well, although Pascal had that feature (and most other languages, including C, just didn't), by the time Pascal's successor, Modula, was developed, the mistake had been corrected. And here's the point: this was all done before, or contemporaneous with, Python's start.

It's one thing to make a mistake and correct it. It's another thing to make the same mistake that was just fixed in another language, and then spend several years before you fix it the same way everyone else did. Pascal and Modula were developed in Europe, same as Python, so there's really no excuse for not knowing about it.

Stubbornness and Doing things differently

So why is it that such things take so long to address? The Python language has been stubborn and unrelenting in its wrongness. I think Guido now agrees that the indentation thing was a mistake. However, there was a long period of time when its superiority was asserted. To me, the mistake is not so much using indentation to begin a block, but the lack of even an optional "end" terminator.

Because of Python's indentation, Python programs reproduced in print are subject to error, since there isn't the needed redundancy check. More seriously, you can't embed the language in a templating language, because the templating can mess up the fragile indenting. So if you are doing, say, web development in Django, you need to learn another language, which of course doesn't have that indentation rule. Ruby programmers don't suffer that limitation; their templating systems use Ruby.

It also feels to me like stubbornness and arrogance that Python has long resisted using common idioms of other languages. (By the way, I sense that Dennis Ritchie and Guido van Rossum are a little similar here.) So the backtick notation of shell, Perl, and Ruby is not in Python. Nor, for a long time, was there simple variable interpolation. (In Ruby that's done via # rather than $, but the change was for technical reasons, not arrogance or a desire to do things differently.)

In Python, how to run subprocesses has changed over the years, and it still isn't as simple as a backtick. But I suppose that has some cover, because there are security considerations. Variable interpolation inside a string was also long resisted, although there was something like C's format specifiers inside a string with '%'. But once Python added its own flavor of variable interpolation using the .format() method, C-style format specifiers were eschewed.
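A sketch of the idioms in question, as they stand today (the echo command is just an illustrative example):

```python
# Where shell/Perl/Ruby would write `echo hi` in backticks, Python
# uses the subprocess module:
import subprocess

out = subprocess.run(["echo", "hi"], capture_output=True, text=True).stdout
print(out.strip())  # hi

# And string interpolation has gone through several flavors:
name, n = "world", 3
a = "hello %s, %d times" % (name, n)       # C-style % specifiers
b = "hello {}, {} times".format(name, n)   # str.format()
c = f"hello {name}, {n} times"             # f-strings (Python 3.6+)
assert a == b == c
```
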

This kind of being the last on the block to give in to something common among other languages, and then adding it in a different way, seems to be the Python way. The ternary if/else operator, which C introduced and other languages adopted, is another example.
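For comparison, a one-line sketch of the two spellings:

```python
# C:      parity = (x % 2 == 0) ? "even" : "odd";
# Python: the value-if-true comes first and the condition sits in the
# middle, unlike C's condition-first order.
x = 10
parity = "even" if x % 2 == 0 else "odd"
print(parity)  # even
```
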

For those that want to work across multiple languages this kind of awkward originality just adds more warts.

Like Python, when C first came out, it had a number of unfamiliar things: all those post- and pre-increment operators, I guess the format specifier thing, variable arguments, the C preprocessor with its macro expansion. (Many of these weren't strictly unique: you could find examples of these things elsewhere, such as in assemblers; but compared to, say, the high-level ALGOL-like languages and things like Cobol or Fortran, this was different.) In C's case, though, they largely stuck with these and didn't change course. When developing Go, the mistakes of C were just fixed, and so I feel that Go is a very welcome upgrade to C.

Misc other things: comprehensions (not needed in better-designed languages), language drift, import statements.

Saturday, December 9, 2023

Another example of why NYC is a great place to live but I wouldn't want to visit.

Today is Santacon. 

It is the day when people young and old, big and small, dress up as Santa Claus. Or if that's not your thing, just wear felty or furry brown and put on a pair of felt antlers and that works too. 

I see such diversity in costume today (maybe just the hat, maybe just the pants) as I walk the half mile to the Saturday Farmer's Market. I am looking for basically two items.

The first item is Brussels sprouts on the stalk. If the Brussels sprouts are very fresh, you can eat the middle of the stalk. It has a carrot-like texture, and the taste is a mild kind of horseradish, with a slightly skunky smell like that of cabbage or broccoli, which Brussels sprouts are related to. For the last month or so, they have been appearing at the Farmer's Market.

The second item is Concord grapes. I like them for their pissy taste. That taste comes from the "fox grape", a grape native to the northeastern coastal part of the U.S.

I have never had a fox grape and would like to try one. Like raspberries or low-bush blueberries, they are a bit fragile and do not transport very well. So people make jams with them.

Historically, I imagine they are interesting for this reason: since they are the native grape, I imagine that some ancient Viking real-estate developer hit on the idea of calling this area "Vinland" after discovering one of these. I imagine also that this was the same guy who previously had a successful real-estate run after naming another piece of land "Greenland". There is a spot not far from NYC, across the great psychological divide of the Hudson River, which one of his descendants named "Meadowlands". This is a double lie: it is a swamp.

Anyway, today I could not find either at the Farmer's Market. I imagine it is this kind of specificity from people in NYC that contributes to why others, not from the region, may view us as "snobby".

On the way back, I stopped into a large bookstore on Broadway. Yes, that Broadway; the bookstore is just not in the theater or show-business district. 

I have this weird idea that one day I will find the script to a radio show I once heard, by David Mamet and someone else, that has music to the lyrics:


Hail to thee, George Topax

'ner against the foe lax,

Tibia quo pax,

Ee-i-ee-i-o.

          As we move along,

          through the goolagong, 

          we know where you belong,

          on your big fat throne, so .. 

          here's to thee, George Topax, ...


I don't find it. I check also the science and math section. And being a geek, the computer section as well. 


I see a book called "The Computer and the Brain" by John von Neumann. 



From the attached photo you can see a 1950s-style computer, complete with wires on the front. Those large cylindrical things are some sort of memory.

I open it and read the table of contents. I am interested: PART I: THE COMPUTER. ... PART II: THE BRAIN

Von Neumann is, of course, responsible for the architecture of the modern-day computer, with its CPU and memory. It is not surprising to me that he was also interested in the brain; he was that kind of guy. And I find it cool and reassuring that right from the beginning, in his way of thinking, he separates and distinguishes how a computer of his design works from how a brain works. As I look to see the price, I come across this:


$4.00 - I can afford that. And then I notice the writing. M. L M-something-or-other. Cambridge, Mass Nov. 1960. 

Then it hits me. ...  Could this be Marvin L. Minsky? The beginning matches, but I don't see anything that looks like the end: "sky".  I look on my phone to see if I can find a signature for Minsky. This is in the dingy and slightly damp basement of the Strand. No cell phone connection is available. I pass it by, and move on looking at other sections. 

But I figure: for $4, why not? 

So I go back and pick it up. At the cash register I see that I should have looked at the back of the book: it was $20, not the $4 original price of the book. The $20 figure is because the book is "out of print". Yeah, I can understand why that might be.

Coming home, I look for Minsky's signature. The signatures he writes in "signed copy" books do have a visible "sky". However, I asked a friend who knows more about handwriting analysis, and he says that the "M" and the "k" do look like they were formed the same way.

According to Wikipedia, Minsky joined MIT (in Cambridge, MA) in 1958, the same year as this 3rd printing. The book's topic is clearly one he was interested in.

What do you think? 

Anyway, it was his. That's my story -- and I am sticking to it!

To come back to the title, here is an example of what is cool about NYC. I am just kind of lumbering around on a routine walk and stumble across something like this. On another occasion, on pretty much the same route, I was walking home and passed a theater that said: "World Premiere". It was Samuel Beckett's penultimate play, Worstward Ho. I thought, now there's something you don't see every day (a phrase Edgar often says to Chauncy in the Rocky and Bullwinkle cartoon). But then I thought: well, I am tired tonight. I'll go tomorrow night. And I did.

Another time I was walking around, about 3 blocks from my apartment on an Easter Sunday, and there was no one out on the street except this guy, Philip Glass, who I later learned lives here.

As for the "but I wouldn't want to visit" part, the touristy kinds of things tend not to thrill me. 

I did, however, like going up to the top of the World Trade Center, because from inside, the view in each direction was very myopic, and all four compass sections are very, very different. Then you could go to the top and see how it all pieces together. But to really appreciate something like this, you have to know something about the area.


Saturday, October 28, 2023

The two times I have been misled by Compiler Canon (and the Dragon book)

Twice in my life there have been these weird moments in compilers where I realized that something I had been taught, something that had been considered canon, turned out to be misleading.

The first time this happened was when I started at IBM Research in a compiler group. Previously, my formal knowledge of compilers came from the first edition of the Dragon Book.

The focus of register allocation was on reducing the stack depth used in evaluating expressions. There was something called Sethi-Ullman numbers, which computed the maximum stack depth.
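To give the flavor of the idea, here is a toy sketch of my own (not code from the Dragon Book): the Sethi-Ullman number of an expression tree is the minimum number of registers, equivalently the maximum stack depth, needed to evaluate it without spilling.

```python
# A minimal sketch of Sethi-Ullman numbering; class and function names
# here are my own, for illustration.

class Node:
    def __init__(self, op, left=None, right=None):
        self.op, self.left, self.right = op, left, right

def su_label(node):
    """Return the Sethi-Ullman number of an expression tree."""
    if node.left is None and node.right is None:
        return 1                      # a leaf needs one register
    l = su_label(node.left)
    r = su_label(node.right)
    if l == r:
        return l + 1                  # equal subtrees need one extra register
    return max(l, r)                  # otherwise evaluate the bigger side first

# (a + b) * (c + d): each sum needs 2 registers, and since both sides
# are equally demanding, the product needs 3.
expr = Node('*', Node('+', Node('a'), Node('b')),
                 Node('+', Node('c'), Node('d')))
print(su_label(expr))  # 3
```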

Since one of the authors of the Dragon Book was Jeff Ullman, it is not a surprise that they would be mentioned.

And for the DEC PDP-11 machines this made sense, because there were stack-like increment/decrement instructions on a special "stack pointer" register. It never made sense for IBM's computers, which had general-purpose registers without such increment/decrement instructions.

When I got to IBM around 1982, there was this register-allocation implementation by Greg Chaitin, based on an idea by the Russian computer scientist A. P. Ershov via IBM Fellow John Cocke. While the idea had been floated by John Cocke since 1971 or so, Greg's simple, elegant implementation made it happen and put it into the PL.8 compiler.

Subsequent editions of the Dragon Book ditched Sethi-Ullman numbers and described this instead.
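For flavor, here is a toy sketch of my own of the graph-coloring idea; Chaitin's real allocator also handles spilling and coalescing, which this deliberately omits. Variables are nodes, an edge means two variables are live at the same time, and a k-coloring assigns each variable one of k registers so that interfering variables never share one.

```python
# Chaitin-style simplify/select, greatly simplified for illustration.

def color(interference, k):
    """Return {var: register} using at most k registers, or None if some
    variable would have to be spilled."""
    graph = {v: set(ns) for v, ns in interference.items()}
    stack = []
    # Simplify: repeatedly remove a node with fewer than k neighbors;
    # such a node is guaranteed a free color later.
    while graph:
        candidates = [v for v, ns in graph.items() if len(ns) < k]
        if not candidates:
            return None               # a real allocator would spill here
        v = candidates[0]
        stack.append((v, graph.pop(v)))
        for ns in graph.values():
            ns.discard(v)
    # Select: pop nodes back, giving each the lowest register its
    # already-colored neighbors don't use.
    assignment = {}
    while stack:
        v, neighbors = stack.pop()
        used = {assignment[n] for n in neighbors if n in assignment}
        assignment[v] = min(r for r in range(k) if r not in used)
    return assignment

# a interferes with b and c; b and c don't interfere: 2 registers suffice.
print(color({'a': {'b', 'c'}, 'b': {'a'}, 'c': {'a'}}, 2))
```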

The second time, though, is recent, and relates to how compilers work and how decompilation works.

In around 2006, I was interested in decompilation as a way to get precise location information about where a program is when it is running code, such as inside a debugger, or just before producing a stack dump. There was some code that was started around 2000 and largely abandoned a year or two later, after going through two people. I don't think subsequent maintainers understood it as well, since it has a compiler-centric orientation.

There were a couple of people who wrote theses on this topic, and their work seems to be canon for decompilation. It assumes a general-purpose decompiler -- one that starts with machine code and produces source code. It has no understanding of the source language, translator, or compilation system used. Therefore, its output source code has to be some sort of lower-level language bridging the semantics of machine-code-level instructions and control flow. C is a good language for this kind of thing.

The decompiler-as-a-kind-of-compiler approach, however, is at odds with this. It does assume a specific compilation system and its transformations into a high-level bytecode. And by doing this, the recovered source code is the same as the source code we started out with. Furthermore, by doing this we don't really need to understand the semantics of the bytecode instructions!

At first I thought this was kind of a one-off. Part of this was thinking that bytecode, as a general principle, is a one-off. I now realize that this is indeed not the case. As the Wikipedia article on the p-code machine explains, the idea goes back as early as 1966 and traces through Euler and Pascal via Niklaus Wirth. Rather than being a one-off, bytecode is used in a number of systems, like Smalltalk and Java, probably gaining popularity from the Pascal P-code interpreter, UCSD Pascal, and Turbo Pascal.

There was a school of thought at IBM that there was a holy grail: a universal, source-code- and machine-independent, register-based intermediate language -- something akin to LLVM. To some extent there still is a school of thought that believes this, and there is possibly some truth to it. However, there are a lot of systems that use a bytecode designed for a particular language or a particular family of languages. Microsoft CIL and the JVM are examples of the family-of-languages camp.

See https://www.youtube.com/watch?v=IT__Nrr3PNI&t=5311s for the history of the JVM and how it was influenced by bytecode.


Friday, November 26, 2021

Sysadmin Horror Story with a Moral

Of all of my horror stories working for a NYC financial organization, the worst one was also the most memorable, not because of its horribleness, but rather because there was something that I discovered which has served me well in the decades since.

Back in the days before server clouds and virtual machines, we had real computing hardware.

Sometimes, as in this financial institution, the servers, all 1,000 or so of them, were housed right inside the business. This firm owned an entire 24-floor building not far from Rockefeller Center.

We were informed by NYC's power company, Con Edison, that in order to keep up with our growing power demand, they needed to turn off the power to the building for a little bit. It might have just been to the two or three floors where we housed all of the computers. This would be done on a Saturday night, when all of the financial markets were closed.

The plan was that Saturday evening we'd power down all of the servers and Sunday morning at 6AM the servers would be powered back up. Because every server had to be checked, we divided the power-on phase into two groups: one starting at 6AM, and another starting at noon. We had the more experienced people doing the noon shift, because we expected that some servers might be tricky, so the early group could focus on quantity and leave the more difficult machines for the more advanced group.

This seemed pretty reasonable and clever. I was on the second shift.

Things went pretty well and we got most of the servers up and running.

Except there was one. And it was one of the more important real-time trading servers.

So now we flash back six weeks earlier...

Back in the days before the widespread use of server configuration-management and orchestration tools like Kubernetes, Ansible, Chef, or Puppet, we had to write our own scripts to do mundane things.

One time I was assigned the task of writing a script to update root passwords everywhere.

Actually, the assignment wasn't really that. A batch of sysadmins had burned out and quit, and upper management wanted to make sure ASAP that they didn't inflict damage on us. Financial institutions weren't friendly places to work.

Taking the longer view of things, I wrote a general program to change root passwords. You weren't going to change the attitude of the financial institution; they treated everyone like a disposable wipe. So you adjusted: this was an activity that was clearly going to happen over and over again.

Part of the task was, of course, to write the program to update the root password. The other part of the task, doing it everywhere, was actually more difficult: getting and maintaining an inventory of servers. A number of the servers you could get from a service developed at Sun Microsystems called NIS.

However, not all servers used this, so the NIS information had to be melded in with DNS. The two lists were largely distinct, because the servers had been set up by two different and somewhat competing groups. However, the groups got merged sometime after I was hired.

When a server had more than one network interface, it was important to catch that. I had written a blast script that would take a list of servers, copy some code to each server, and run it. The code here was the thing that changed the root passwords.

One of these servers had more than one interface, was listed in both DNS and NIS, and so it was listed twice. My blast script updated servers independently rather than serially.
When you have 1,000 or so servers to update, you need to update servers in parallel if you want to get any task done in a single work shift. (More about that aspect perhaps some other time; that story has a moral, too.)

There is a funny thing about the Unix file system (and by extension Linux as well): consistency is not guaranteed over independent file writes. In other words, if you have two clients each of which opens the same file and writes to it in parallel with the other client, the result will probably be garbage. And that's what happened when I ran my password update in parallel through two different network interfaces.
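The cure is to serialize the read-modify-write with a file lock. A minimal sketch in Python, assuming a Unix system (the paths, record format, and function name are made up for illustration; the real script changed the shadow file):

```python
import fcntl
import os
import threading

LOCK_PATH = "/tmp/shadow.lock"    # hypothetical paths, for illustration only
DATA_PATH = "/tmp/shadow.demo"

def update_password(user, new_hash):
    """Read-modify-write the whole file under an exclusive advisory lock,
    so that two updaters running in parallel cannot interleave their
    writes and turn the file into garbage."""
    with open(LOCK_PATH, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)      # blocks until we own the lock
        lines = []
        if os.path.exists(DATA_PATH):
            with open(DATA_PATH) as f:
                lines = f.read().splitlines()
        # replace any existing entry for this user, then append the new one
        lines = [l for l in lines if not l.startswith(user + ":")]
        lines.append(user + ":" + new_hash)
        with open(DATA_PATH, "w") as f:
            f.write("\n".join(lines) + "\n")
        # lock is released when `lock` is closed

# Two "blast" workers hitting the same entry at once -- the duplicate-listing
# scenario -- now leave exactly one clean record instead of garbage.
if os.path.exists(DATA_PATH):
    os.remove(DATA_PATH)
threads = [threading.Thread(target=update_password, args=("root", "x"))
           for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
```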

The result was that we had a shadow password file which was garbage.

Much later when I learned about "idempotency" in configuration management, a light bulb went off in my head. See definition 3 in https://foldoc.org/idempotent .
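A tiny illustration of what clicked, in my own made-up example: an idempotent action satisfies f(f(x)) == f(x), so re-running a half-finished or accidentally repeated update is harmless.

```python
def grant_admin_append(users):
    # NOT idempotent: re-running it duplicates the entry
    return users + ["admin"]

def grant_admin_set(users):
    # Idempotent: "ensure admin is present" -- safe to run any number of times
    return users if "admin" in users else users + ["admin"]

once = grant_admin_set(["alice"])
print(grant_admin_set(once) == once)                    # idempotent: True
print(grant_admin_append(grant_admin_append(["alice"])))  # duplicated entry
```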

Now, we discovered the mistake pretty soon. I went to upper management to report the problem. I figured that since we were low on sysadmins, which is why I was doing this in the first place, my job was safe for a little while, at least until I got the program working reliably by adding a couple of file locks to it and correcting the inventory.

Then I was asked what the current situation of the server was. Was the important trading system still trading?

Interestingly the answer was, well, actually yes. It was. Any program that started out with root privileges and that was running was still functioning normally. It was only new requests that couldn't log in as root.

So upper management said, don't worry about it then: just don't reboot the box.

This is exactly the highly skilled 2-steps-ahead kind of answer typical of those minds that work at places making decisions about your financial future.

So now we come back to the problem: Con Edison had decided that we were going to reboot. Most of the important servers were hooked to battery backup with specially monitored power regulators. However, this one server was in the trading area, on some guy's desk, hooked into a wall outlet.

And as we learned that day, it also had disk clustering of the root file system, which was also slightly dysfunctional. The process of rebooting involved using a magical command as root, which we were only to find out about later. I hope you see where we are going with this...

If you are still with me, now I get to the main part of the story.

We spent hours trying all sorts of things and the usual tricks to bring up a server in repair mode. But because the root partition was part of the disk clustering, which was messed up, we couldn't. All the other folks were told to go home because it was just this one server, and then it was just me and the sysadmin team lead.

We had a contract with the manufacturer/vendor for support of the important servers, but not for this particular server, which wasn't racked with the others. Finally, the sysadmin team lead, Sam, decided to put a half hour of live support, about $800, on his own personal credit card. (This was in 1999 dollars.) I was impressed: it was resourceful and took guts.

We were told that we would deal with the expert in Veritas Cluster Management, our clustering software, at Sun.

So we were hopeful.

He had us try a few things, but none of them worked. And then, after 3 minutes of elapsed time, he said, "I hope you have good backups to restore from." Well, predictably, this important but rogue trading server, which happened to be sitting on some dude's desk and had neither fault-tolerant power nor server support maintenance, wasn't in the list of servers to be backed up either.

But Sam just said: I want a second opinion.

The guy said: Okay, I hope your résumés are up to date.

I was very agitated. Using a low-level disk-reading command, dd, I was able to see that all the clustering info was there. It was just a matter of reconstructing whatever information was needed at boot so we could then get to a single-user root shell with /etc/ mounted. To spend $800 to be told in 3 minutes to basically blow it all away and restore everything from backup (which would have taken several hours and lost some amount of trading data) really irked me.
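What I did was along these lines (a hedged sketch: I no longer remember the real device name, so a scratch file with made-up contents stands in for the raw disk here):

```shell
# Create a stand-in "disk" containing some volume-manager-looking metadata.
printf 'VERITAS-vol-header\0\0more-config-data' > /tmp/disk.img

# Read the first raw sector(s) and look for readable clustering metadata,
# much as dd against the real device showed the info was still intact.
dd if=/tmp/disk.img bs=512 count=1 2>/dev/null | strings | head
```

Against the real device, the input would be the raw disk device rather than a file, but the principle is the same: dd reads the bytes regardless of whether the volume manager can mount anything.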


But Sam kept cool. We talked to the manager, who explained that this guy really was the top-of-the-line expert in Veritas Clustering that they had.

Sam patiently said this wasn't acceptable and that we still had 15 minutes of paid time to figure something out. The manager said, well, in 5 minutes, at midnight, the shift changes, and we have another guy on duty. He's just the guy we use to fill in during off hours when there is low activity, and he doesn't know that much about Veritas Clustering. Sam said, okay, let's try him.

We wait the 5 minutes and talk to the other guy. He affirms that the person we talked to really was the expert on Veritas Clustering. He himself didn't know much about it, but he would look up whatever resources he had available and call us back. Okay. And in another 5 minutes he says, well, by googling he sees that you can get direct support for Veritas Clustering from the Veritas company itself; Sun Microsystems was just an OEM provider.

So we called up Veritas and gave them the dd information I had gotten earlier. With that, someone at Veritas was able to write a custom shell script with which we were able to recreate the volume-manager information with all of the current data intact, so that we could boot the server to at least a root shell. YAY!


Moral: It is sometimes more important to find someone willing to help and work with you than someone who has the most knowledge about something. With someone trying to be helpful, you might be able to figure out the things you don't know, things that an expert might overlook.
 

Aftermath:

While we now had root access, we still had a lot of work to recreate the shadow passwords and go over the /etc/ filesystem. We had both been there over 12 hours. Sam had to get some rest before doing this important, unforgiving, and custom bit of work. And this was wise, too, because when you are tired it is too easy to make a mistake. So Sam was going to go home and sleep until 5AM.

For my part, I just slept on the wall-to-wall carpeted floor in the sysadmin room. The carpeting was there because this was the floor with the trading room, not because the company cared about the comfort of sysadmins sleeping on the floor.

At about 3AM I hear the locked door open. It was the night guard who said he was suspicious because he heard some heavy snoring. I assured him I heard it, too, and was certain it was coming from the next room over.