Network Computing
ON THE WIRE / BILL ALDERSON AND J. SCOTT HAUGDAHL


The Case Of The Good Guy, Bad Guy

Q: Our latest sleuthing efforts find us in the midst of a Microsoft-centric network.

In this case, our client's typical workstation is running Windows for Workgroups with NT Server as the primary network operating system. The underlying network infrastructure consists of several 16-Mbps Token-Rings supporting thousands of users. Rather than implement the usual Windows for Workgroups NetBIOS that runs directly on top of Logical Link Control (LLC), our client chose the IP "route" (no pun intended) by implementing TCP/IP in the workstations and servers. Thus NetBIOS traffic that spans network segments doesn't have to be source routed or transparently bridged as in most NetBIOS implementations, but rather, it is routed via IP. The standard for NetBIOS over TCP/IP is contained in RFCs 1001 and 1002.

In NT Server, the Server Message Block (SMB) protocol is used to provide file and print services to a Windows for Workgroups workstation. SMB and NetBIOS often go hand-in-hand, as is the case here. However, there are notable exceptions, such as the SMB implementations found in VINES and Pathworks.

Back to our client's problem: In one network segment, Windows for Workgroups users can't save data to network drives.

Scott: Often, one of the first steps in troubleshooting a network problem is to pinpoint the manifestation of the problem, rather than accept the typically vague explanation of, "Uhh, like, I can't save a file."

Bill: In this case, we determined that a user could log into a server, run an application from that server and load a file from that server. However, about 15 seconds after the user attempted to save data, the connection to the drive mapping on the server was no longer valid.

Scott: Curiously, saving small files around 1 KB or so worked well; only writing the larger files had the problem.

Bill: Not only that, but some users on nearby rings had no problem at all.

Scott: Rather than spend the rest of our lives examining and comparing Windows .INI files, we decided to insert an analyzer into the net and do a little protocol analysis.

Bill: We set up a packet trace on the workstation in question. Everything looked normal--the login, the application load, the file reads and even the file writes--as long as the write block sizes were small.

Scott: However, when the application attempted to write a block size of, say, 2 KB, there was no SMB acknowledgment from the server.

Bill: Instead, the workstation's TCP layer would retry about half a second later with the same request, then one second later, two seconds later and so on, for six total attempts before giving up. In other words, there were five retries, with the retry interval doubling each time, for a total of close to 16 seconds before dropping the connection.
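The retry timing Bill describes can be checked with a little arithmetic. The sketch below assumes the interval starts at exactly 0.5 seconds and doubles exactly on each retry; the function name and parameters are illustrative and do not come from any real TCP stack.

```python
def retry_schedule(initial_delay=0.5, attempts=6):
    """Return the gaps (in seconds) between successive attempts and their sum.

    Assumes pure doubling: six attempts means five retries, so five gaps.
    """
    delays = [initial_delay * 2 ** i for i in range(attempts - 1)]
    return delays, sum(delays)


delays, total = retry_schedule()
print(delays)  # [0.5, 1.0, 2.0, 4.0, 8.0]
print(total)   # 15.5
```

Under these assumptions the gaps add up to 15.5 seconds, which is consistent with the "close to 16 seconds" seen in the trace and with the roughly 15-second delay users reported before the drive mapping failed.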

Scott: Since there were several retries with no response from the server, and other workstations were still communicating with the server successfully, we then suspected a problem in the communications path between the workstation and server.

Bill: The Data Link Control (DLC) addresses showed that the problem workstation was sending and receiving its packets across the ring to and from a router. Interestingly, this network had redundant routers and the first three attempts were sent to the primary router and the last three to the secondary router. The IP addresses revealed the true end points, that is, the workstation and server.

Scott: Further, we could tell from the IP hop count that there were three hops, or routers, between the source and destination. The packet may have been lost at any of these points.
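One common way to read a hop count from a trace is to compare the time-to-live (TTL) field of a received IP packet with the TTL the sender started with, since each router decrements it by one. The sketch below assumes that method. The initial TTL is something the analyst must know or guess, and the values used here are hypothetical, not taken from the actual trace.

```python
def hops_traversed(initial_ttl, observed_ttl):
    """Number of routers a packet crossed, given its starting and observed TTL."""
    if not 0 < observed_ttl <= initial_ttl:
        raise ValueError("observed TTL must be positive and no larger than the initial TTL")
    return initial_ttl - observed_ttl


# Hypothetical values: a server that sends with TTL 32, seen on the
# workstation's ring with TTL 29.
print(hops_traversed(32, 29))  # 3
```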

Bill: As long as we were already analyzing, we decided to examine a series of transactions from another workstation reading and writing data to the same server from another ring.

Scott: This other workstation never had a problem writing large blocks of data. Further analysis revealed that this station talked to a different router, and that its maximum packet size was typically 1,500 bytes.

Bill: Backing up a "bit" and analyzing a login and file load/save sequence from the good workstation and router combo gave us the answer we were looking for.

Scott: OK! Don't hold our readers in suspense any longer. Is it the workstation? The router? The LAN? Sunspots?

Bill: The problem lies in the routers. Each router connects to an intervening serial link. If a packet exceeds the maximum transmission unit (MTU) of that link, the "good guy" router port returns an Internet Control Message Protocol (ICMP) packet indicating that the destination is unreachable because fragmentation is required.

Scott: However, the "bad guy" router port didn't bother informing the sender of this little problem, thus literally "dropping the ball" or packet, altogether.

Bill: Naturally the sender begins to worry in a short time (about half a second) and sends the first retry packet. This retry packet has the same problem at the router, causing subsequent retries until the workstation finally gives up.

Scott: Looking inside one of the workstation's packets shows that the "don't fragment" (DF) bit is set, meaning that the router should not attempt to fragment the packet. This causes the router to generate an ICMP error packet back to the sender. In this case, the MTU of the serial link was set at 1,500 bytes. Since the Token-Ring station could potentially generate up to a 17,800-byte packet, the router must either fragment the packet for the user or, as in this case, inform the workstation that an MTU problem exists and that the packet was dropped.

Bill: Right. When a router is about to route an IP packet that has the DF bit set and exceeds the MTU of the next segment, it must send an ICMP message back to the sender. When the station receives this ICMP notice that the packet was discarded, it automatically adjusts its MTU downward, resends, gets through and from then on uses the smaller MTU when writing larger block sizes. An SMB write of a larger block size then causes the station to generate a NetBIOS continuation packet for the remaining data.
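The difference between the "good guy" and "bad guy" router ports can be sketched as a small simulation. This is a simplified model, not router or Windows code: the names `Packet`, `route` and `send_with_retries` are hypothetical, the 2,048-byte packet size is an illustrative stand-in for a 2 KB write, and the sender is assumed to drop straight to the link MTU when it receives the ICMP notice.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size: int            # total IP datagram length in bytes
    dont_fragment: bool  # state of the DF bit


def route(packet, next_hop_mtu, sends_icmp=True):
    """Return what a router does with a packet bound for a link of next_hop_mtu.

    "forward"  - packet fits and is passed on unchanged
    "fragment" - too large, DF clear, so the router splits it
    "icmp"     - too large, DF set, dropped with an ICMP notice
    "silent"   - too large, DF set, dropped with no notice (the "bad guy")
    """
    if packet.size <= next_hop_mtu:
        return "forward"
    if not packet.dont_fragment:
        return "fragment"
    return "icmp" if sends_icmp else "silent"


def send_with_retries(size, next_hop_mtu, sends_icmp, max_attempts=6):
    """Simulate a sender with DF set; return (attempt that got through, final size)."""
    for attempt in range(1, max_attempts + 1):
        outcome = route(Packet(size, True), next_hop_mtu, sends_icmp)
        if outcome == "forward":
            return attempt, size
        if outcome == "icmp":
            size = next_hop_mtu  # assumption: shrink directly to the link MTU
        # on "silent" the sender learns nothing and retries at the same size
    return None, size


print(send_with_retries(2048, 1500, sends_icmp=True))   # (2, 1500)
print(send_with_retries(2048, 1500, sends_icmp=False))  # (None, 2048)
```

With the ICMP notice, the second attempt gets through at the reduced size. Without it, all six attempts are sent at the original size and the sender gives up, which matches the behavior seen in the trace from the problem workstation.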

Bill: We checked all the router configuration parameters and software version and couldn't detect any differences between the good router port and bad router port.

Scott: So what was the solution?

Bill: At this point, we called the router vendor. The vendor dialed into the router in question. The vendor "couldn't find anything wrong" and "didn't change anything." The next day, the router "magically" began working properly, by handling the DF bit with an ICMP response.

Scott: Gee, the vendor must have hired Rod Serling from The Twilight Zone for tech support.

Bill: Fortunately, our traces proving that the router had this problem did not disappear into the Twilight Zone!

Bill and Scott are principals of Pine Mountain Group. They can be reached at otw@pmg.com. Portions of the actual trace files from selected columns are available in the Network Computing On the Wire CompuServe forum (GO OTW) or via Pine Mountain Group's Home Page on the Web at http://www.pmg.com. These files are copyrighted by Pine Mountain Group, and may not be used for any commercial purposes.





