At work our CTO has started doing "Software Life Stories". The idea is he will bring in someone who has had a career as a programmer to talk to us over lunch. For our first installment he brought in his friend Tom Cargill.
Tom has been a programmer for longer than I've been alive and had a wealth of information to share. One question I asked him was if he could point to specific things he did that made him a better programmer. One of the answers he gave was that he understood what he was working on at one level deeper than the current problem he was solving. For example, he said while working on debuggers he understood the machine code beneath the language he was writing a debugger for.
As part of Tom's career he worked at Bell Labs. While there he
helped invent the proc filesystem. He mentioned this in passing, saying
that it helped him build a debugger. I had heard of proc before but
didn't know anything about it. A few days later I was checking in on
the health of our servers and applications. The script we use to do
that had identified a problem. To understand what might be going on I
first needed to understand what the script was doing. While reading I
saw "/proc/$pid/cwd". Shoot, there's proc again. Time to take Tom's
advice and go one level deeper to understand what the heck it is. Or
at the very least scratch the surface of the next level.
This article seems like a good overview of procfs. Take that from
someone who doesn't know anything about procfs, so everything in there
could be a lie. The high-level view is that it exposes kernel data
structures as files so applications can learn more about what is going
on in the system. So, the "/proc/$pid/cwd" above is getting the current
working directory of $pid. Interesting! Systems that know about
themselves and that you can ask questions of are ones I like working on.
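To see that for myself, here is a minimal sketch of the lookup. It assumes a Linux machine with /proc mounted and uses the current shell's own pid ($$) as a stand-in for $pid; the script at work surely does more than this.

```shell
# Use the current shell as a stand-in for $pid (an assumption for this sketch).
pid=$$

# /proc/$pid/cwd is a symbolic link to that process's current working directory,
# so readlink is enough to resolve it.
cwd=$(readlink "/proc/$pid/cwd")

echo "pid $pid is running in $cwd"
```

Swapping $$ for the pid of any other process you own should work the same way.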
I messed around some with the examples in the article. This one jumped out to me:
~$ cat
# In another terminal window
~$ pgrep -x cat
187843
~$ echo foo > /proc/187843/fd/0
# foo will appear in the other terminal window
Cool. So, we can write to fd/0 (STDIN) of the process. Since that process is cat it writes out to STDOUT whatever is input on STDIN. I wonder if we can read from STDOUT?
~$ tail -f /proc/187843/fd/1
# In another terminal window
~$ echo foo > /proc/187843/fd/0
# crickets in the window running tail...
Hmm, nothing is ever quite as easy as I hope it will be. I don't understand why I can't just listen to STDOUT but it has something to do with the fact that cat is connected to a tty. In an effort to not go too deep I'll save that rabbit hole for another day.
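For whenever that day comes, a minimal sketch of a first step is to ask /proc what those fd entries actually point at. The current shell ($$) is a stand-in here; substitute the pid of the cat process. As far as I understand it, when cat is started from an interactive terminal all three entries usually point at the same /dev/pts/N device, which would mean the earlier echo wrote to the terminal itself and tail is reading from the terminal rather than tapping cat's output. Treat that as a hunch to verify, not a conclusion.

```shell
# Stand-in pid for this sketch; replace with the pid reported by pgrep.
pid=$$

# Each entry under /proc/$pid/fd is a symlink to whatever that descriptor has open.
for fd in 0 1 2; do
  # A descriptor may be closed, so fall back to a label instead of failing.
  target=$(readlink "/proc/$pid/fd/$fd" 2>/dev/null) || target="(closed)"
  echo "fd $fd -> $target"
done
```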
As I said earlier, Tom mentioned that he helped to invent this filesystem. I looked up the paper about the invention of /proc. It is shockingly readable and short for a technical paper, so I definitely think it is worth a read.
If you happen to be trying to run the examples on your own machine you
may have found that /proc doesn't exist. On some flavors of Unix
there is no /proc. This got me curious about the reasons why. It
seems that it can be unsafe. FreeBSD had it at one point but removed it.
There is much hand waving about the reasons why it was removed. My
takeaway from that is that to build anything mission critical on top of
proc I'll need to understand it a bit more and some of the race
conditions that can crop up when accessing the files in there*.
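One race I am fairly sure about: a process can exit between the moment you look up its pid and the moment you read its /proc entry, so any read can fail. A minimal sketch of guarding against that, again using the current shell ($$) as a stand-in pid, looks like this. It does nothing about a pid being reused by a different process, which is a separate problem.

```shell
# Stand-in pid for this sketch; in practice this would come from pgrep or similar.
pid=$$

# Check the result of the read itself instead of testing for existence first,
# since the process could disappear between a check and the read.
if cwd=$(readlink "/proc/$pid/cwd" 2>/dev/null); then
  status="pid $pid is in $cwd"
else
  status="pid $pid is gone or not readable"
fi

echo "$status"
```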
That's all for today. I'll call it 5% of a full level deeper.
* While looking into procfs I came across a really neat YouTube video of how file descriptors can be used to prevent race conditions when accessing files. I rarely watch YouTube videos about coding but that video is worth a watch and makes me want to see more of his videos.