diagnosis - Diagnostic Steps Les Cottrell SLAC Presented at...

http://sdu.ictp.it/lowbandwidth/
Diagnostic Steps
Les Cottrell – SLAC
Presented at the Optimization Technologies for Low-Bandwidth Networks, ICTP Workshop, Trieste, Italy, 9-20 October 2006
http://www.slac.stanford.edu/grp/scs/net/talk06/diagnostics.ppt
Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM), also supported by IUPAP
Slide: 2 Les Cottrell, SLAC
Get ready
  Bring up a terminal window so you can try some commands
  Bring up the presentation so you can click on links: www.slac.stanford.edu/grp/scs/net/talk06/diagnostics.ppt
Slide: 3 Les Cottrell, SLAC
Aim
  Goal: provide a practical guide to debugging common problems
  Why is diagnosis difficult yet important?
  Local host
  Ping, Traceroute, PingRoute
  Looking at time series
  Locating bottlenecks
  Correlation of problems with routes
  More tools and problems
  Where is a node
  Who do you tell, what do you say?
  Case studies and More Information
Slide: 4 Les Cottrell, SLAC
Why is diagnosis difficult?
  Internet's evolution as a composition of independently developed and deployed protocols, technologies, and core applications
  Diversity, highly unpredictable, hard to find “invariants”
  Findings may be out of date
  Measurement/diagnosis not high on vendors' list of priorities
    Resources/skill focus on more interesting and profitable issues
    Tools lacking or inadequate
    Implementations are flaky & not fully tested with new releases
Slide: 5 Les Cottrell, SLAC
Add to that …
  Distributed systems are very hard
    “A distributed system is one in which I can't get my work done because a computer I've never heard of has failed.” Butler Lampson
  Network is deliberately transparent
  The bottlenecks can be in any of the following components:
    the applications
    the OS
    the disks, NICs, bus, memory, etc. on sender or receiver
    the network switches and routers, and so on
  Problems may not be logical
    Most problems are operator errors, configurations, bugs
  When building distributed systems, we often observe unexpectedly low performance, the reasons for which are usually not obvious
  Just when you think you’ve cracked it, in steps security
    Firewall, NAT boxes etc.
    Block pings, traceroute looks like port scan, diagnostic tool ports are blocked …
    ISPs worried about providing access to core, making results public, & privacy issues
Slide: 6 Les Cottrell, SLAC
Sources of problems
  Host “errors”
    TCP buffers, heavy utilization …
  Duplex mismatch (Ethernet)
  Misconfigured router/switches
    Including routing errors, especially for backup paths
  Bad equipment, wiring/fiber problem
  Congestion
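One common form of the "TCP buffers" host problem is a window smaller than the path's bandwidth-delay product. The slide does not spell this out, so the sketch below is only an illustration of that arithmetic. The function names and the example figures (a 100 Mbit/s path, 200 ms round-trip time, 64 KiB window) are assumptions chosen for the example, not values from the talk.

```python
def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    # bits/s * ms / 1000 gives bits; divide by 8 for bytes.
    return bandwidth_bps * rtt_ms / 8000


def window_limited_bps(window_bytes, rtt_ms):
    """Upper bound on TCP throughput when at most one window is sent per RTT."""
    return window_bytes * 8 * 1000 / rtt_ms


if __name__ == "__main__":
    # Assumed example path: 100 Mbit/s bottleneck, 200 ms round-trip time.
    bandwidth_bps = 100_000_000
    rtt_ms = 200
    window_bytes = 64 * 1024  # assumed 64 KiB buffer/window

    print(f"BDP: {bdp_bytes(bandwidth_bps, rtt_ms):,.0f} bytes")
    limit = window_limited_bps(window_bytes, rtt_ms)
    print(f"Throughput limit with 64 KiB window: {limit / 1e6:.2f} Mbit/s")
```

With these assumed numbers the path needs roughly 2.5 MB in flight, while a 64 KiB window caps throughput at about 2.6 Mbit/s, far below the link rate.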
Slide: 7 Les Cottrell, SLAC
Fire: Local Host
  Usual Unix tools (uname -a, top, vmstat, iostat …)
  Is the host overloaded, do you have a gateway (route), name server (nslookup/dig), which interface are you using (
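As a rough illustration of the first local-host questions above (what system is this, is it overloaded, does name resolution work), here is a minimal Python sketch using only the standard library. It is not a replacement for the Unix tools named on the slide. The function names, the default test name `localhost`, and the "load above CPU count" overload heuristic are all assumptions made for this example.

```python
import os
import platform
import socket


def local_host_report(test_name="localhost"):
    """Collect a few basic facts about the local host."""
    report = {}
    u = platform.uname()
    # Roughly the information `uname -a` would show.
    report["system"] = f"{u.system} {u.release} ({u.machine})"
    report["cpu_count"] = os.cpu_count()
    # Load averages exist only on Unix-like systems.
    try:
        report["load_avg"] = os.getloadavg()
    except (AttributeError, OSError):
        report["load_avg"] = None
    # Name-resolution check, a stand-in for a quick nslookup/dig.
    try:
        report["resolved"] = socket.gethostbyname(test_name)
    except OSError as exc:
        report["resolved"] = None
        report["resolve_error"] = str(exc)
    return report


def looks_overloaded(load_avg, cpu_count):
    """Assumed heuristic: 1-minute load above the CPU count suggests overload."""
    if load_avg is None or not cpu_count:
        return None  # cannot tell on this platform
    return load_avg[0] > cpu_count


if __name__ == "__main__":
    report = local_host_report()
    for key, value in report.items():
        print(f"{key}: {value}")
    print("overloaded:", looks_overloaded(report["load_avg"], report["cpu_count"]))
```

Passing a remote host name as `test_name` exercises the configured name server rather than the local hosts file.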