...making Linux just a little more fun!
By Jim Dennis, Jason Creighton, Chris G, Karl-Heinz, and... (meet the Gang) ... the Editors of Linux Gazette... and You!
From Rick Moen
Answered By: Jay R. Ashworth, Ben Okopnik, Sindi Keesan
I've been paying a little closer attention to SMTP errors, since the migration of LG's public mailing lists. Here's one for TAG subscriber Sindi Keesan. (I just received one each of these following Deepak and Ben's posts to the "TFTP problem" thread, and undoubtedly will get another for this one.)
From MAILER-DAEMON Thu Jun 02 11:19:23 2005
From: Mail Delivery System <[email protected]>
To: [email protected]
Subject: Mail delivery failed: returning message to sender
Date: Thu, 02 Jun 2005 11:19:22 -0700

This message was created automatically by mail delivery software.

A message that you sent could not be delivered to one or more of its
recipients. This is a permanent error. The following address(es) failed:

  [email protected]
    Unrouteable address

[goes on to provide a copy of the undeliverable list post]
A lot of us have become accustomed to calling these "bounces" and disregarding them because they're so often cryptic and impenetrable. (My SMTP server, in general, gives pretty clear diagnostic messages, and yet this one was obscure to me, too.) Sometimes, the pedants among us distinguish Delivery Status Notifications (DSNs) from "bounces", where the former are three-digit SMTP-standard error codes and matching explanatory text, generated by the remote SMTP host (MTA process) during an SMTP conversation.
In this case, there's none of that "550 User unknown" or similar DSN stuff, and I was left curious what "Unrouteable address" means, here -- especially since Ben and others have a pretty high opinion of Stephen M. Jones's "SDF Public Access UNIX System, INC." operation at freeshell.{org|net}.
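For readers who have not stared at many SMTP replies: the first digit of the three-digit code carries the broad meaning (2 for success, 4 for a transient failure, 5 for a permanent one). Here is a minimal sketch of that convention; the function name is made up for illustration, and the sample strings are the ones quoted in this thread.

```shell
# Classify an SMTP reply line by the first digit of its three-digit code.
# smtp_reply_class is a hypothetical helper name, not part of any MTA.
smtp_reply_class() {
    case "$1" in
        2[0-9][0-9]*) echo "success" ;;
        3[0-9][0-9]*) echo "intermediate" ;;
        4[0-9][0-9]*) echo "transient failure" ;;
        5[0-9][0-9]*) echo "permanent failure" ;;
        *)            echo "no SMTP reply code" ;;
    esac
}
```

Fed "550 User unknown", this reports a permanent failure; fed "Unrouteable address", it finds no reply code at all, which is the oddity being chased here.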
I was intending to attempt a manual SMTP session with that system (by telnetting into its mail exchanger (MX)). The first step, then, is to ask the public DNS where freeshell.org's MXes are:
[rick@linuxmafia]
~ $ dig -t mx freeshell.org +short

; <<>> DiG 9.2.4 <<>> -t mx freeshell.org +short
;; global options: printcmd
;; connection timed out; no servers could be reached
[rick@linuxmafia]
Hmm. Can that be right? No nameservers can be reached that are authoritative for the domain? First, let's cross-check to make sure I'm getting meaningful results for similar queries on other domains (i.e., that I don't just have network or DNS-access problems of my own):
[rick@linuxmafia]
~ $ dig -t mx apple.com +short
30 eg-mail-in1.apple.com.
10 mail-in3.apple.com.
10 mail-in4.apple.com.
10 mail-in5.apple.com.
[rick@linuxmafia]
~ $ dig -t mx linuxmafia.com +short
10 linuxmafia.com.
[rick@linuxmafia]
~ $
Yep, that's all looking good. Let's see what IPs are listed as freeshell.org's authoritative nameservers in the whois servers:
See attached whois-output.txt
Er, I might be missing something, but having all of one's nameservers be in-domain seems like a bit of a hazard. Sure, the top-level nameservers will also have their IPs as part of the DNS's "glue records", but the rest of us won't. And having only two nameservers is a bit thin.
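The "all nameservers in-domain" condition is easy to check mechanically once you have the list of names from whois. The following is only a sketch under that assumption: the function name is hypothetical, and it does no DNS lookups itself, just string comparison on the names you hand it.

```shell
# Succeed (exit status 0) only if every nameserver name lies inside the domain.
# Usage: all_ns_in_domain DOMAIN NS...
# all_ns_in_domain is a hypothetical helper; it compares names only.
all_ns_in_domain() {
    domain=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    domain=${domain%.}                 # drop a trailing dot, if any
    shift
    for ns in "$@"; do
        ns=$(printf '%s' "$ns" | tr '[:upper:]' '[:lower:]')
        ns=${ns%.}
        case "$ns" in
            "$domain"|*."$domain") ;;  # in-domain: reachable only via glue
            *) return 1 ;;             # found an out-of-domain nameserver
        esac
    done
    return 0
}
```

Called with freeshell.org and the two names NS-A.FREESHELL.ORG and NS-B.FREESHELL.ORG, it succeeds, flagging the hazard; called with a list that includes a nameserver under some other domain, it fails.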
[Ben] Indeed, it is a hazard; the few times that SDF has gone down, it was like being shifted sideways into an alternate universe in which it had never existed. The response from web browser, fetchmail, etc. amounted to "Freeshell? What's a Freeshell? Go away, you silly man - we have no time for psychotics with an overactive imagination."
As I've found out while researching my response to Jay, I think we've now found part of the reason for Stephen's problem: His third nameserver is getting ignored (not used), because of obsolete glue records in his parent zone. He needs to fix that.
It would be nice if Sindi Keesan or someone else whose domain name doesn't have "linux" in it would advise Stephen of that, and gently lead him by the hand to the www.dnsreport.com test CGI -- as that gives a nice overview of his problems (and, basically, a checklist).
[Sindi] I would be happy to send along a message to him from my address here if you tell me exactly what to say and where to address it to.
This reminds me of when someone knowledgeable at my local bbs figured out why the electric company's online billing site went in little circles, but they did not want to hear about it. (Nor were they interested in the fact that Spamassassin was dumping their enormous emailed bills for five different reasons, including green fonts, too many images, odd looking subject line or from, too much HTML, and 'porn', and the mails were too large to receive at my address without Spamassassin).
OK, try this:
According to the report on
http://www.dnsreport.com/tools/dnsreport.ch?domain=freeshell.org ,
your third nameserver (ns-c.freeshell.org) isn't in the authoritative
list in the .org records, even if you have it in the zonefile. Because
of that, it probably won't get DNS queries about freeshell.org, which
may partially explain the outage we had recently.
Also (as mentioned in that report), "freeshell.org." in your zonefile's SOA record is wrong, and probably should be "ns-a.freeshell.org."
That report also makes some sensible-sounding suggestions about timeouts to tweak in the SOA record, which you might consider.
[Sindi] Who do I send this to? I am not very familiar with sdf, just use it for email and website.
Re-checking my first post to this thread:
See attached whois-output.txt
The indicated e-mail address appears to be that of Stephen M. Jones, proprietor.
Suggestion: "whois" is your friend.
[Sindi] I sent it to the address below with a short preface stating that a 'friend' suggested I pass along this information. Thanks. This situation reminds me of that of a friend whose daughter will not accept email from him if it is properly spelled - she knows he is dyslexic and insists that he write it himself, so when I write it for him we have to send it from his address not mine, and make sure to introduce some spelling errors. (And then her 8 year old mysteriously sends perfectly spelled emails 'all by himself'.)
[Jay] Two nameservers is indeed a bit thin... but on the other point, unless my understanding of DNS is also thin, the parent nameserver is always going to hand you the glue, is it not?
It's going to hand you the glue records if it has them. One of the reasons I like the "DNS Report" test at http://www.dnsreport.com is that it shows you, by implication, the immense variety of ways to screw up one's DNS -- and one of them is to have missing or incorrect glue records in the parent zone. Recommended facility, anyway.
[Jay] And in return, nice tip. :-)
Very cool site. I wonder if he has a version that returns something more easily parseable, by, say, Nagios. Or, alternatively, will make his script available. Must look closer.
In fact, if you look closely at http://www.dnsreport.com/tools/dnsreport.ch?domain=freeshell.org , it is evident that Stephen M. Jones did at some point deploy a third nameserver, but that it's missing from the parent records, which explains some of his fragility problems. (He also has some minor SOA errors.)
[Jay] I was a touch surprised, though, that you didn't demonstrate the handy-dandy "+trace" option to dig, which I fell in love with the minute I found it:
Neat. Here's the result for linuxgazette.net:
See attached dig-trace-output.txt
I haven't done nearly enough playing around with new DNS tools: I'm one of those codgers who've been hanging onto nslookup and sulking about its ongoing demise. Thank you for pointing out that trick!
[Jay] I'm trying to figure out a reasonable way to automate running it and looking for changes; it's not quite tuned for that. Perhaps the dnsreport code would be easier to use that way.
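One possible approach to the "looking for changes" part, offered as a sketch rather than a finished monitor: boil the trace output down to its NS delegations, sorted, so that two snapshots can be compared with diff. The function name is invented here, and it assumes raw dig output on standard input (not output that has been quoted with "> " prefixes in mail).

```shell
# Print "zone nameserver" pairs from dig +trace output read on stdin,
# lower-cased and sorted so that two snapshots can be compared with diff.
# trace_ns_records is a hypothetical helper name.
trace_ns_records() {
    awk '$3 == "IN" && $4 == "NS" { print tolower($1), tolower($5) }' | sort -u
}
```

Something like "dig +trace freeshell.org | trace_ns_records > today.txt" followed by "diff yesterday.txt today.txt" (file names are placeholders) would then show only delegation changes, ignoring the timing and byte-count lines that differ on every run.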
[Jay] It automatically traces the domain down from the root, showing you the salient information at each step of the way; the important bit in this case was:
> freeshell.net.          172800  IN      NS      ns-a.freeshell.org.
> freeshell.net.          172800  IN      NS      ns-b.freeshell.org.
> ;; Received 82 bytes from 192.12.94.30#53(E.GTLD-SERVERS.net) in 81 ms
>
> ;; reply from unexpected source: 65.32.1.80#53, expected 192.94.73.20#53
> ;; Warning: ID mismatch: expected ID 31090, got 36862
> ;; reply from unexpected source: 65.32.1.80#53, expected 192.94.73.20#53
> ;; Warning: ID mismatch: expected ID 31090, got 36862
> ;; Received 31 bytes from 192.67.63.37#53(ns-b.freeshell.org) in 45 ms
Note that a) it thinks those servers are in freeshell.org, not .net, and b) it appears that neither of them is answering the phone.
You can see that it did get an answer, though I'm a touch irked at dig that it didn't tell us what that answer was. The 65.32 servers are the customer resolver servers for Road Runner TampaBay, which is my uplink; why it saw fit to answer for itself I don't know -- clearly, since I did this from a Linux box, it should not have even been being asked...
But it appears still not to be running; perhaps the gent gave up?
Looks like he had a one-day outage, and is back.
[Jay] Well, good. We don't need that sort of service much anymore, but those that need it... need it.
Here's the list from my own domain:
Domain servers in listed order:
    NS1.LINUXMAFIA.COM    198.144.195.186
    NS.PRIMATE.NET        198.144.194.12
    NS1.VASOFTWARE.COM    12.152.184.135
    NS.ON.PRIMATE.NET     207.44.185.143
    NS1.THECOOP.NET       216.218.255.165
For some reason, my Tucows / OpenSRS registration lists the authoritative nameservers' IP addresses in the public DNS, while Stephen M. Jones's doesn't. I'm not clear on why this is.
Anyhow, that's at least something to go on. Let's find out what IP addresses the authoritative nameservers have:
[rick@linuxmafia]
~ $ host NS-A.FREESHELL.ORG
NS-A.FREESHELL.ORG has address 192.94.73.20
[rick@linuxmafia]
~ $ host NS-B.FREESHELL.ORG
NS-B.FREESHELL.ORG has address 192.67.63.37
[rick@linuxmafia]
Well, at least that much of his DNS is working.
[Ben] Wouldn't that be your DNS that's working? Unless I'm mistaken, "host" uses your /etc/resolv.conf to look up hosts - unless you specify another DNS server explicitly.
Well, DNS being the distributed system that it is, you're always using some client piece and some server piece. But what I meant is that at least that much of his DNS information is working (accessible and useful). It might very well have been cached in my or some other non-authoritative nameserver's records, yes. But I was wanting to fetch his authoritative nameservers' IPs from somewhere -- anywhere -- so that I could ask them questions directly, as the next step. Nothing like getting DNS answers straight from the horse's mouth, if you don't mind the rather unsanitary metaphor.
Let's ask the nameservers explicitly by their IP addresses, to make double-sure the query's going to the right place:
~ $ dig -t mx freeshell.org @192.94.73.20 +short

; <<>> DiG 9.2.4 <<>> -t mx freeshell.org @192.94.73.20 +short
;; global options: printcmd
;; connection timed out; no servers could be reached
[rick@linuxmafia]
~ $ dig -t mx freeshell.org @192.67.63.37 +short
[rick@linuxmafia]
~ $
How odd. Looks to me like the first nameserver doesn't respond, and the second returns some sort of null result. Just out of old-fogydom, and as a cross-check on "dig", let's do the same query using nslookup (a tool that's now deprecated, in general):
[rick@linuxmafia]
~ $ nslookup -query=mx freeshell.org 192.94.73.20
;; connection timed out; no servers could be reached
[rick@linuxmafia]
~ $ nslookup -query=mx freeshell.org 192.67.63.37
Server:         192.67.63.37
Address:        192.67.63.37#53

** server can't find freeshell.org: SERVFAIL
[rick@linuxmafia]
~ $
[Ben] I believe that "host" is the recommended replacement for "nslookup" these days; I groused a bit about having to learn its syntax, but it's quite nice once you do. It's a sort of a cross between "dig" and "nslookup":
host -t mx freeshell.org 192.67.63.37
freeshell.org MX 50 smtp.freeshell.org
Ah, it appears that I've underestimated the thing. Thanks.
In DNS lingo, SERVFAIL means that the domain does exist and that the root name servers have information on it, but that its authoritative name servers are not answering queries about it. So, basically Stephen M. Jones has one of his two nameservers offline, and the other misconfigured to the point that it stutters and faints when you ask it questions.
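(For reference, RFC 1035 defines SERVFAIL, response code 2, more narrowly as "server failure": the queried server was unable to process the query.) The three outcomes seen in the sessions above, a timeout, a SERVFAIL, and an empty answer, can be told apart by looking at the tool's text output. The following is a rough sketch only; the function name and category labels are invented, and it matches on the message strings shown in this thread, which other versions of dig or nslookup may word differently.

```shell
# Sort the text of a dig or nslookup run (read on stdin) into rough categories.
# classify_lookup and its labels are hypothetical, for illustration only.
classify_lookup() {
    output=$(cat)
    case "$output" in
        *"connection timed out"*) echo "no reply" ;;
        *SERVFAIL*)               echo "server failure" ;;
        "")                       echo "empty answer" ;;
        *)                        echo "answer received" ;;
    esac
}
```

Piping the first nameserver's output through this gives "no reply"; the second nameserver's nslookup output gives "server failure", and its silent "dig +short" result gives "empty answer".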
I hope it's a temporary glitch, but this (among other things) points out why continuity of DNS service is so important, and why two nameservers really aren't quite enough.
http://www.dnsreport.com/tools/dnsreport.ch?domain=freeshell.org is also interesting, giving an overview of just how much is broken here (a lot). (The freeshell.net variant of the domain has the same problem for the same reasons.)
Ben mentioned on a private mailing list that Stephen's a good guy and performs a generous service to the public but for reasons of personal experience loathes Linux. I vaguely remembered when that came about, and have re-found the rant he posted at the time, which still makes interesting reading:
http://web.archive.org/web/20010712145226/http://www.lonestar.org/sdf/
[Ben] Yeah, that was what I'd based my statement on; that, and the dance of boundless joy that he performed on the Freeshell list after the move was done (it's archived there, but it's not web-accessible AFAIK.)
I'll drop Stephen a brief, polite heads-up e-mail. I hope he won't mind my address. ;->
Stephen goes into a little more detail about his circa-2001 disenchantment with Linux, here:
http://mail-index.netbsd.org/port-alpha/2001/12/28/0008.html
Oh, what the heck, I might as well just quote it, because it's still relevant:
From [email protected] Sun, 15 Jul 2001 13:59:04 -0700
Date: Sun, 15 Jul 2001 13:59:04 -0700
From: Rick Moen [email protected]
Subject: [CrackMonkey] How come I am just hearing about this?
begin Bob Bernstein quotation:
> Found on the netbsd-advocacy list:
> http://www.lonestar.org/sdf/
It's got to really suck, being sysadmin of a public-access Unix system: You'll have an ungodly number of careless users, plus you have to worry about attacks both from arbitrary remote locations and attacks both by your users and by outsiders masquerading as legitimate users. When the day comes of you suddenly realising that your site has for some time been massively compromised, more often than not, you have only surmises about how entry and compromise occurred.
Numerous of Stephen Jones's statements suggest that such was the case with freeshell.org (aka "SDF"):
> I'm even thinking of just removing telnet/ftp/pop3 all together...
Plaintext-authentication network access to shell accounts: check.
> ...we might as well had our passwords in plain text as LINUX's use of
> encryption is about twice as good as Microsoft's.
Unless I misremember very badly how the login process works, passwords are not processed in kernelspace. Some Unixes introduce a PAM layer, while others do not. Some support MD5; others do not. But the irony is that Stephen's users did have their passwords in plaintext -- every time they did telnet/ftp/pop3!
In fact, it's obvious that Stephen's brand new to security-mindedness:
> I've never felt security was important because I sincerely thought
> that a public system would be sacred ground to anyone be they a
> cracker or just normal user.
Poor bastard. I'd say he's had a rude awakening, except that I think it's not even begun, yet.
His July 11 note suggests that he was still running Linux kernel 2.2.18, which has been security-obsolete for a dog's age. (Note exception: Some distributions provide kernels with nominally earlier version numbers that have been patched to have the fixes introduced in nominally later kernels.)
Stephen writes:
> I blame LINUX due to the recent unveiling of the "oh, if root wasn't
                           ^^^^^^
> already easy to get, here is an easy way" bug in the execve() system
> call...
But that ptrace/execve() race condition (note: a local-user exploit) was not recent at all: It was a long while back. Wojciech Purczynski reported it to BugTraq on 2001-03-27:
http://www.securityfocus.com/templates/archive.pike?list=1&mid=171708&_ref=539250975
> ...where malicious code an be executed via almost any binary.
                                             ^^^^^^^^^^^^^^^^^
As Purczynski says, any SUID binary. But the point is that all this is very old news, 3+ months old. And extremely well known (as the "ptrace exploit").
Now, it seems likely that Stephen was still running a non-ptrace-patched 2.2.18 (or earlier) Linux kernel when the shit hit the fan, proving if nothing else did that he was asleep at the wheel -- but it's also clear that his system was security-exposed in a multitude of other ways, AND that he still is. (Example: He hasn't yet firmly deep-sixed telnet, POP3, and ftp inbound access mechanisms exposing shell passwords to the Net. He will.)
Let's do a taxonomy of root-compromise attacks (as opposed to DoS attacks and other categories): Rarely, these might be compromises of daemon processes or kernel network stacks from remote -- e.g., against vulnerable releases of BIND v. 8.x, lpr, or wu-ftpd. If the attack is not one of those, it must involve acquiring user-level access first, and then attacking the host from inside, impersonating a legitimate user. (In other words, the compromise of root authority is either from outside the host, or inside. Inside is much easier.)
The latter category breaks down further, according to how the attacker arranges to impersonate a legitimate user, into sniffing versus other. Sniffed passwords are, of course, what you get with standard deployments of telnet, non-anonymous ftp, and POP3 daemons -- and are a particularly ignominious way to get compromised. Stephen is only now thinking of shutting off this possibility -- so I fear he has other hard lessons yet to come.
The other ways of compromising shell-account passwords all trace back to the fact that users are pretty much always the weak element. If you let them, they'll use the same weak password everywhere. If you assign passwords yourself, change them at intervals, and remove the SUID bit from /usr/bin/passwd -- and sternly admonish the users not to expose their passwords through re-use on other systems -- they'll still do the latter, because they can, and because you can't stop them. Switch to one-time pad authentication, and they'll store the pads or seeds on vulnerable systems. And of course they'll ssh in, thinking that's unconditionally secure -- from compromised systems where attackers are logging all keyboard activity. And, remember, it takes only one user's shell access getting compromised. (Or, of course, the attacker might just sign up for a user account.)
You might be able to prevent system security from being shot in the foot by your users by requiring them all to use physical security dongles (e.g., SecurID) plus one-time pads. Maybe. But not on a public-access Unix system. In that sense, Stephen is screwed.
How so? Because he's doomed to having attackers occasionally get user-level access -- and protecting root against local users is much more difficult. While the remote attacker can fruitfully attack only your running network daemons and network stacks, as a local user he can attack any security-sensitive binary on your system -- a much wider field of targets. Stephen can try to keep installed software current, remove some, remove SUID/SGID from others, recompile using StackGuard, implement a capabilities model / ACLs, keep selected subtrees on write-protected media, and so on.
And he'll still get clobbered, from time to time. Odds are, he won't even be aware of compromise for quite a long time (as he probably wasn't, this time). Does he have IDSes set up? Of course not! He's "still against security". But that will change. Papa Darwin is a good, if ungentle, teacher.
NetBSD 1.5.1 on Alpha is going to be an eminently suitable system for him (even though the Alpha is doomed over the longer term). And Nick used to deliberately keep an Alpha on-line with a very vulnerable, antique OS load, just for the amusement value of watching x86-oriented kiddies' canned attacks crash and burn on it.
But Stephen has a longer-term problem, and it has nothing to do with kernel vulnerabilities -- let alone old ones that he should have long ago patched.
[Sindi] Thanks for pointing out to me that this thread has something to do with why sdf was gone yesterday, but it is way beyond my ability to understand. Could you summarize in a few sentences what this all means, for a beginning linux user with no computer training since Fortran IV?
As The Doctor says, "Ah, that takes me back -- or is it forward? That's the problem with time travel; you can never tell." ;->
(I likewise cut my teeth on FORTRAN, in my case playing around on university mainframes.)
If my post seemed a little bit meandering, it was because I was chasing down several things:
The first question fascinated me because, of late, I've taken a particular interest in understanding SMTP-protocol mutterings -- and have improved my own machine's articulateness in that area -- and yet the quoted advisory (issued by my machine's MTA) was about as clear as mud.
So, the answer turned out to be: "I, the lists.linuxgazette.net SMTP process, couldn't even attempt delivery at all, because the destination domain's DNS is completely non-functional."
That also furnishes the answer to question #2: There was no DSN conversation because my MTA couldn't even look up the destination IP, let alone talk to freeshell.org's SMTP host.
And answer #3 was: "It's unclear whether SDF's main services themselves went down, because SDF's nameservice outage made those unreachable by name, even if they were still running."
So, as I was saying, continuity of DNS service is really, really important to Internet services, and Stephen M. Jones's DNS for freeshell.org (as presently configured) has proven to be fragile. And thus your (recent) problem.
I sent a short, polite heads-up advisory to Jones, about his DNS outage. He didn't respond, but I gather from your post that he must have fixed it!
[Sindi] Thanks for your explanation which I hope I understood.
I had Fortran in high school, back before our high school even owned a computer. We typed out our programs on yellow paper tape which was sent to the other high school to run. In college our physical chemistry professor decided to teach us some useful skills, so in addition to learning to solder together a crystal radio, we ran programs on punch cards also in Fortran. In grad school it was still punch cards (one audited course on Pascal). The rest is self taught.
I think time actually goes in circles so that everything is new every once