...making Linux just a little more fun!
Ramanathan Muthaiah [rus.cahimb at gmail.com]
Fri, 22 Dec 2006 06:09:23 +0530
Gang,
Recently, there were some discussions at my workplace regd packet losses in WAN interface. And then, one folk in IT dept came up with output of "ping" command to highlight that there are no packet losses.
Am sure, this is not the correct way to measure packet losses.
I feel, they should be monitored over a period of time at the gateway router and the traffic in this router should be analysed for dropped packets / timeouts.
Is this true ?
NOTE: Am not working in the IT dept but one of the affected parties.
/Ram
Kapil Hari Paranjape [kapil at imsc.res.in]
Fri, 22 Dec 2006 09:29:33 +0530
Hello,
On Fri, 22 Dec 2006, Ramanathan Muthaiah wrote:
> Recently, there were some discussions at my workplace regd packet > losses in WAN interface. And then, one folk in IT dept came up with > output of "ping" command to highlight that there are no packet losses. > > Am sure, this is not the correct way to measure packet losses.
True enough. "ping" is/was a useful way to check that the link is up and get a rough idea of round-trip times. That's it.
Kapil. --
Benjamin A. Okopnik [ben at linuxgazette.net]
Thu, 21 Dec 2006 22:36:38 -0600
On Fri, Dec 22, 2006 at 06:09:23AM +0530, Ramanathan Muthaiah wrote:
> Gang, > > Recently, there were some discussions at my workplace regd packet > losses in WAN interface. And then, one folk in IT dept came up with > output of "ping" command to highlight that there are no packet losses. > > Am sure, this is not the correct way to measure packet losses. > > I feel, they should be monitored over a period of time at the gateway > router and the traffic in this router should be analysed for dropped > packets / timeouts. > > Is this true ?
In theory, the router itself should have on-board tools to measure this - I can't think of anything else that would do the job. The problem is that traffic on the WAN side is meaningless for those purposes - you don't know what part of it is supposed to be routed into the LAN - and the traffic on the LAN side doesn't tell you anything, since you don't know how many packets were supposed to get through. I've heard of traffic analyzers (hardware) that are supposed to let you troubleshoot that kind of problems, but they're supposed to be extremely expensive.
The best bet is to find out if your router has a telnet interface (many of them do), log in, and snoop around. If it's got an 'ifconfig' command, that might go a long way toward answering your question.
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Benjamin A. Okopnik [ben at linuxgazette.net]
Thu, 21 Dec 2006 22:54:53 -0600
On Fri, Dec 22, 2006 at 09:29:33AM +0530, Kapil Hari Paranjape wrote:
> Hello, > > On Fri, 22 Dec 2006, Ramanathan Muthaiah wrote: > > Recently, there were some discussions at my workplace regd packet > > losses in WAN interface. And then, one folk in IT dept came up with > > output of "ping" command to highlight that there are no packet losses. > > > > Am sure, this is not the correct way to measure packet losses. > > True enough. "ping" is/was a useful way to check that the link is up > and get a rough idea of round-trip times. That's it.
I've also found that it can be used, with the '-f' option, to stress-test a suspect 10Mb/S NIC (I've never tried it on a faster network; _caveat geek._) "su -c 'ping -f hostname'" is a perfectly lovely way to entertain yourself on a network full of crap hardware, and a good way to produce reports that show why you need new cards.
Ram, if you can convince your IT folks to OK it, you might want to try this - do note that it'll probably DoS your router while you run it. It's not a complete diagnostic - if it doesn't fail does not mean that it's good - but if you get serious packet loss from doing it, then there's most likely a problem. 30 seconds of ping-flooding ought to be more than enough.
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Predrag Ivanovic [predivan at ptt.yu]
Fri, 22 Dec 2006 15:18:07 +0100
On Thu, 21 Dec 2006 22:54:53 -0600 Benjamin A. Okopnik wrote:
<ping>
> > I've also found that it can be used, with the '-f' option, to > stress-test a suspect 10Mb/S NIC (I've never tried it on a faster > network; _caveat geek._) "su -c 'ping -f hostname'" is a perfectly > lovely way to entertain yourself on a network full of crap hardware, and > a good way to produce reports that show why you need new cards.
Ben, would using 'ping -f' on localhost produce any meaningful results? I would like to stress-test NIC on my machine, I'd like to see if it is responsible for (sometimes very high) packet loss. Look at this 2 consecutive ping runs:
--- [1] ping www.yahoo.com PING www.yahoo-ht2.akadns.net (209.73.186.238): 56 octets data 64 octets from 209.73.186.238: icmp_seq=0 ttl=51 time=179.6 ms 64 octets from 209.73.186.238: icmp_seq=2 ttl=51 time=155.8 ms 64 octets from 209.73.186.238: icmp_seq=3 ttl=51 time=135.9 ms 64 octets from 209.73.186.238: icmp_seq=4 ttl=51 time=138.1 ms 64 octets from 209.73.186.238: icmp_seq=7 ttl=51 time=144.8 ms 64 octets from 209.73.186.238: icmp_seq=9 ttl=51 time=152.9 ms 64 octets from 209.73.186.238: icmp_seq=10 ttl=51 time=192.8 ms 64 octets from 209.73.186.238: icmp_seq=11 ttl=51 time0.9 ms 64 octets from 209.73.186.238: icmp_seq=13 ttl=51 time=174.9 ms 64 octets from 209.73.186.238: icmp_seq=15 ttl=51 time=128.4 ms 64 octets from 209.73.186.238: icmp_seq=16 ttl=51 time=142.1 ms 64 octets from 209.73.186.238: icmp_seq=18 ttl=51 time=144.8 ms 64 octets from 209.73.186.238: icmp_seq=19 ttl=51 time=164.3 ms --- www.yahoo-ht2.akadns.net ping statistics --- 21 packets transmitted, 13 packets received, 38% packet loss round-trip min/avg/max = 128.4/158.1/200.9 ms [2] ping www.yahoo.com PING www.yahoo-ht2.akadns.net (209.73.186.238): 56 octets data 64 octets from 209.73.186.238: icmp_seq=0 ttl=51 time=234.2 ms 64 octets from 209.73.186.238: icmp_seq=1 ttl=51 time=165.9 ms 64 octets from 209.73.186.238: icmp_seq=2 ttl=51 time=149.1 ms 64 octets from 209.73.186.238: icmp_seq=3 ttl=51 time=168.4 ms 64 octets from 209.73.186.238: icmp_seq=4 ttl=51 time=162.9 ms 64 octets from 209.73.186.238: icmp_seq=5 ttl=51 time=184.9 ms 64 octets from 209.73.186.238: icmp_seq=6 ttl=51 time=179.3 ms 64 octets from 209.73.186.238: icmp_seq=8 ttl=51 time=192.4 ms 64 octets from 209.73.186.238: icmp_seq=12 ttl=51 time=177.4 ms 64 octets from 209.73.186.238: icmp_seq=13 ttl=51 time=191.4 ms 64 octets from 209.73.186.238: icmp_seq=14 ttl=51 time=185.6 ms 64 octets from 209.73.186.238: icmp_seq=15 ttl=51 time=122.1 ms 64 octets from 209.73.186.238: icmp_seq=16 ttl=51 time=158.8 ms --- www.yahoo-ht2.akadns.net ping statistics --- 17 packets transmitted, 13 packets received, 23% packet loss round-trip min/avg/max = 122.1/174.8/234.2 ms ---If NIC is OK, next to check would be cable modem i.e coaxial cable that goes from modem to NIC. Since I crimped that cable, and it is with twist-on connectors, *it is* possible that it causes the trouble. (sometimes, resetting the modem/reattaching the coax helps). NIC uses via-rhine kernel driver(VIA Rhine PCI Fast Ethernet driver). I didn't mean to hijack the thread, but this somehow seems related ;)
Pedja
-- Complicated == Learning Experience -- Joe Bowman
Benjamin A. Okopnik [ben at linuxgazette.net]
Fri, 22 Dec 2006 09:11:08 -0600
On Fri, Dec 22, 2006 at 03:18:07PM +0100, Predrag Ivanovic wrote:
> On Thu, 21 Dec 2006 22:54:53 -0600 > Benjamin A. Okopnik wrote: > > <ping> > > > > I've also found that it can be used, with the '-f' option, to > > stress-test a suspect 10Mb/S NIC (I've never tried it on a faster > > network; _caveat geek._) "su -c 'ping -f hostname'" is a perfectly > > lovely way to entertain yourself on a network full of crap hardware, and > > a good way to produce reports that show why you need new cards. > > Ben, would using 'ping -f' on localhost produce any meaningful results?
I'm afraid not; if you consider it in terms of the OSI 4-layer model, the ICMP packets would be routed to 'lo' at layer #2, and would never get as far as the hardware interface. You need an actuall remote host to "talk" to so your packets can go out through the NIC, onto the wire, and into the other NIC (and a short way into the other host's stack, where it'll get ACKed.)
> I would like to stress-test NIC on my machine, I'd like to see if it is > responsible for (sometimes very high) packet loss. > Look at this 2 consecutive ping runs: > ---
[ snippage ]
> 21 packets transmitted, 13 packets received, 38% packet loss
[ snipperoonie ]
> 17 packets transmitted, 13 packets received, 23% packet loss
Your NIC is certainly a part of that chain, but I don't know that it would be the first thing I'd suspect - there's probably a fair amount of hardware between you and Yahoo. Have you tried running 'traceroute' on those addresses? That can be nicely instructive, in a bit of visual detail.
> If NIC is OK, next to check would be cable modem i.e coaxial cable that goes from modem to NIC.
If you're going to go one step at a time, I'd first look at the RJ-45 cable connecting the NIC to the modem (assuming that you've already run the host-to-host test that we were just talking about.) Having another system available would also allow you to definitely determine whether everything up to the modem is OK or not: if another host, with its own RJ-45 patch cable, is still flaky, then it's somewhere between the modem and the end you're pinging.
The other advantage of 'traceroute' is you're going to see all the bars and houses of ill repute that your packets are going to visit before they get where they're going. You could ping each of those IPs in turn (going from closest to most remote), and watch the loss ratio. In most cases by far, though, the problem is local - and it's usually either the patch cable from the modem to the router (or host), or from the cable drop to the modem.
> Since I crimped that cable, and it is with twist-on connectors, *it is* possible that it causes the trouble. > (sometimes, resetting the modem/reattaching the coax helps).
Oh yeah. If you made it yourself, and you're not fairly experienced with RG-8 (or RG-6 which I prefer), it may well be the source of the problem. As a former boss of mine, a lab manager at Hughes Aircraft, had scrawled on the whiteboard for a rather clueless MMW engineer (who had tried running an IF signal via a piece of wire, and was wondering why it wasn't working), "400MHz is NOT DC!"
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Predrag Ivanovic [predivan at ptt.yu]
Sat, 23 Dec 2006 18:22:57 +0100
On Fri, 22 Dec 2006 09:11:08 -0600 Benjamin A. Okopnik wrote:
<snip>
> Your NIC is certainly a part of that chain, but I don't know that it > would be the first thing I'd suspect - there's probably a fair amount of > hardware between you and Yahoo. Have you tried running 'traceroute' on > those addresses? That can be nicely instructive, in a bit of visual > detail.
I ran a traceroute -v -n to ISP.
--- root@deus:/usr/ports/pedja/mathomatic#traceroute -v -n ptt.yu traceroute to ptt.yu (212.62.32.65), 30 hops max, 40 byte packets 1 172.17.248.1 36 bytes to 172.19.224.147 10.593 ms 7.322 ms 158.193 ms 2 213.137.109.113 36 bytes to 172.19.224.147 12.411 ms 497.248 ms 124.917 ms 3 213.137.99.193 36 bytes to 172.19.224.147 12.509 ms 12.468 ms 10.190 ms 4 213.137.97.173 36 bytes to 172.19.224.147 43.376 ms 71.164 ms 46.475 ms 5 213.137.107.97 36 bytes to 172.19.224.147 22.555 ms 30.928 ms 21.622 ms ---
and to yahoo.com(just the relevant part below).
--- root@deus:/usr/ports/pedja/mathomatic#traceroute -v -n yahoo.com traceroute: Warning: yahoo.com has multiple addresses; using 66.94.234.13 traceroute to yahoo.com (66.94.234.13), 30 hops max, 40 byte packets 1 172.17.248.1 36 bytes to 172.19.224.147 20.192 ms 42.007 ms 5.684 ms 2 213.137.109.113 36 bytes to 172.19.224.147 10.098 ms 29.912 ms 56.426 ms 3 213.137.99.193 36 bytes to 172.19.224.147 11.681 ms 25.007 ms 12.274 ms 4 213.137.97.173 36 bytes to 172.19.224.147 36.362 ms 25.909 ms 33.405 ms 5 213.137.107.125 36 bytes to 172.19.224.147 15.544 ms 26.807 ms 13.219 ms ---
There is a difference in hop 5, as you can see, but I don't think its relevant in any way. I may have mentioned that few months back cable users got moved behind the proxy, no more public ip addresses( Windows machines+net 24/7=disaster).So, I ran traceroute on proxy(ip address that ip-lookup.net claims I use):
--- root@deus:/usr/ports/pedja/mathomatic#traceroute -n -v 213.137.109.129 traceroute to 213.137.109.129 (213.137.109.129), 30 hops max, 40 byte packets 1 172.17.248.1 36 bytes to 172.19.224.147 18.481 ms 5.958 ms 7.685 ms 2 213.137.109.113 36 bytes to 172.19.224.147 13.821 ms 21.541 ms 20.453 ms 3 213.137.109.114 36 bytes to 172.19.224.147 36.690 ms 16.368 ms 7.293 ms 4 213.137.109.113 36 bytes to 172.19.224.147 16.869 ms 14.729 ms 14.616 ms <snip> 28 213.137.109.113 36 bytes to 172.19.224.147 40.092 ms 42.882 ms 83.277 ms 29 213.137.109.114 36 bytes to 172.19.224.147 87.868 ms 30.703 ms 44.451 ms 30 213.137.109.113 36 bytes to 172.19.224.147 68.395 ms 61.603 ms 65.174 ms ---
and ping:
--- root@deus:/usr/ports/pedja/mathomatic#ping 213.137.109.129 PING 213.137.109.129 (213.137.109.129): 56 octets data 64 octets from 213.137.109.129: icmp_seq=0 ttl=61 time=18.5 ms 64 octets from 213.137.109.129: icmp_seq=1 ttl=61 time=34.3 ms <snip> 64 octets from 213.137.109.129: icmp_seq=11 ttl=61 time=19.6 ms 64 octets from 213.137.109.129: icmp_seq=12 ttl=61 time=111.8 ms --- 213.137.109.129 ping statistics --- 13 packets transmitted, 13 packets received, 0% packet loss round-trip min/avg/max = 14.3/38.7/111.8 ms ---
I'm not sure how to interpret traceroute output, though.
> > If NIC is OK, next to check would be cable modem i.e coaxial cable that > > goes from modem to NIC. > > If you're going to go one step at a time, I'd first look at the RJ-45 > cable connecting the NIC to the modem (assuming that you've already run > the host-to-host test that we were just talking about.) Having another > system available would also allow you to definitely determine whether > everything up to the modem is OK or not: if another host, with its own > RJ-45 patch cable, is still flaky, then it's somewhere between the modem > and the end you're pinging.
RJ-45 cable came included with the modem(Motorola SB 5101E, btw), and I tried another cable/modem combination, same thing happens, so it's rather safe to conclude that problem is something else, I think.
> The other advantage of 'traceroute' is you're going to see all the bars > and houses of ill repute that your packets are going to visit before > they get where they're going. You could ping each of those IPs in turn > (going from closest to most remote), and watch the loss ratio. In most > cases by far, though, the problem is local - and it's usually either the > patch cable from the modem to the router (or host), or from the cable > drop to the modem. > > > Since I crimped that cable, and it is with twist-on connectors, *it is* > > possible that it causes the trouble. (sometimes, resetting the > > modem/reattaching the coax helps). > > Oh yeah. If you made it yourself, and you're not fairly experienced > with RG-8 (or RG-6 which I prefer), it may well be the source of the > problem.
I googled 'crimping for dummies', and made the cable several times, but without proper tools/quality cable/connectors, I really can't be sure, can I? btw, tools etc are on my shopping list, any recommendations(dealers, manufacturers...)?
> As a former boss of mine, a lab manager at Hughes Aircraft, had > scrawled on the whiteboard for a rather clueless MMW engineer (who had > tried running an IF signal via a piece of wire, and was wondering why it > wasn't working), "400MHz is NOT DC!"
<laugh> Many engineers at my workplace are like that, they make a friend of mine (he has an engineering mindset, lots of clue but not a formal degree) absolutely mad, particularly 'I saw it work in a book, so it must be right' attitude. Ben, I'd like to thank you for your time and informations, it's been fun as always, and I learned something in the process
Pedja
-- You can lead an idiot to knowledge but you cannot make him think. You can, however, rectally insert the information, printed on stone tablets, using a sharpened poker. -- Nicolai
Benjamin A. Okopnik [ben at linuxgazette.net]
Sat, 23 Dec 2006 12:43:00 -0600
On Sat, Dec 23, 2006 at 06:22:57PM +0100, Predrag Ivanovic wrote:
> > I ran a traceroute -v -n to ISP. > -- > root@deus:/usr/ports/pedja/mathomatic#traceroute -v -n ptt.yu > traceroute to ptt.yu (212.62.32.65), 30 hops max, 40 byte packets
The right thing to do from there - assuming you suspect the problem is somewhere upstream - is to ping each of the hosts along the way and watch for losses. Again, though, chances are high that your problem is local.
> and ping: > --- > root@deus:/usr/ports/pedja/mathomatic#ping 213.137.109.129 > PING 213.137.109.129 (213.137.109.129): 56 octets data
[snip]
> --- 213.137.109.129 ping statistics --- > 13 packets transmitted, 13 packets received, 0% packet loss > round-trip min/avg/max = 14.3/38.7/111.8 ms
Note that the primary problem isn't there anymore - all your packets came back.
> I'm not sure how to interpret traceroute output, though.
What I normally do is watch the display - if it shows '*' for an intermediate host, either that host isn't answering or the packets are being lost. In any case, the most useful part of it is being able to see that chain of hosts: you can now ping them in turn and see where the loss starts.
> RJ-45 cable came included with the modem(Motorola SB 5101E, btw), > and I tried another cable/modem combination, same thing happens, so it's > rather safe to conclude that problem is something else, I think.
Agreed. Although one time I got surprised by two bad cables in a row, and oh brother did that frustrate the hell out of me! (Also had a similar experience with two bad NICs.)
So, assuming that your patch cable is good, it's down to the modem, the coax, the jack, or the upstream cable/plant. I'd focus on the first two.
> I googled 'crimping for dummies', and made the cable several times, but without proper > tools/quality cable/connectors, I really can't be sure, can I? > btw, tools etc are on my shopping list, any recommendations(dealers, manufacturers...)?
I got religion about crimping tools a long time ago, and Paladin and Klein share the godhead. There are a few lesser deities out there as well (I used a Ziotek, which has a mildly sucky stripper, for ~1000 RJ-45 connections, and it did a pretty good job), but those two always work - even with dicey connectors.
> >As a former boss of mine, a lab manager at Hughes Aircraft, had > > scrawled on the whiteboard for a rather clueless MMW engineer (who had > > tried running an IF signal via a piece of wire, and was wondering why it > > wasn't working), "400MHz is NOT DC!" > > <laugh> > Many engineers at my workplace are like that, they make a friend of mine > (he has an engineering mindset, lots of clue but not a formal degree) > absolutely mad, particularly 'I saw it work in a book, so it must be right' attitude.
We had a fellow at Hughes - can't rememer his name now, but a very sweet and friendly Chinese MMW scientist - whom I caught one day while he was trying to jam an SMP connector into a E-band waveguide, grinding away at the job vigorously and looking puzzled at their failure to fit (E-band WG has a rectangular hole about 1/8"x1/16"; an SMP connector is ~5/8" diameter hex nut.) I took it away from him - he had managed to grind off the gold plating on both; the waveguide, at least, would have to be replated, and the solid coax would have to be remade - and explained that
1) Waveguides carry the RF. 2) Coax carries the IF/modulation (so far, so good - he was nodding along.) 3) To get the IF signal off the RF carrier, you needed a pickup - i.e., a Gunn or an IMPATT diode mounted in a waveguide with a coax takeoff. He knew this intellectually, but had somehow failed to connect the books with physical reality.I then took a pickup from a stack of them *on his desk*, 6" away from his elbow; got him another, non-destroyed piece of waveguide; bolted the whole thing up with the 10-24 machine screws and nuts - of which he had a bin, same as there was on every desk in that lab - and hooked it up to the power meter. He was effusively grateful, and promised to remember the process from then on.
Brilliant MMW designer, by the way, and excellent at estimating, "by feel", the size of a resistive waveguide insert for a given dB drop (which is Deep Black Magick, I assure you.) He got sniped by a headhunter from TRW later, and last I heard - mid-90s - was making over 200 big ones a year.
> Ben, I'd like to thank you for your time and informations, it's been fun as always, > and I learned something in the process
[grin] Always happy to assist with the bits that I know; I'm getting the same thing in exchange, and figure that to be a more than fair deal.
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *