Difference between revisions of "Category:Linux troubleshooting"

From Christoph's Personal Wiki
(Linux networking)
 
(41 intermediate revisions by the same user not shown)
Line 1: Line 1:
This category will contain a collection of articles on '''troubleshooting Linux'''. It will be ''highly'' biased towards [[SuSE]] Linux (version 10.1 or later), as that is my primary OS. I am also using a 64-bit (x86_64) kernel, so my articles will also be biased towards these systems.
+
This category will contain a collection of articles on '''troubleshooting Linux'''. It will also include a lot of random commands I use for troubleshooting. It will be ''highly'' biased towards Red Hat-based (e.g., [[CentOS]]) and Debian-based (e.g., [[Ubuntu]]) distros, but most of the commands should work on most Linux distros. I am also using a 64-bit (x86_64) kernel, so my articles will also be biased towards these systems.
  
''Note: Most of the following have also been tested on Mandriva Linux.''
+
{{Disclaimer-linux}}
 +
==Emergency reboot==
 +
see: [[wikipedia:Magic SysRq key]]
  
== System information ==
+
Alt + SysRq + '''REISUB''' ("Raising Elephants Is So Utterly Boring"; execute in ''slow'' succession)
% dmesg
+
% vmstat  # to quickly monitor CPU, memory, and I/O usage and decide which is the bottleneck
+
  
  % ps -ef | egrep '^root ' | gawk '{print $2}'  # method 1
+
*For the above to work, you must have it enabled first:
% pgrep -u root                                # method 2
+
  echo "1" >/proc/sys/kernel/sysrq
  
% cat /proc/cpuinfo
+
In order to have it always enabled (e.g., after a reboot), one must enable it in the <code>/etc/sysconfig/sysctl</code> file (in [[SuSE]], at least):
% cat /proc/partitions
+
  ENABLE_SYSRQ="yes"
  % cat /proc/meminfo
+
For further information see <code>/usr/src/linux/Documentation/sysrq.txt</code>
% cat /proc/sys/vm/swappiness  # number from 0 - 100; the higher the number the more the system will swap
+
% cat /proc/interrupts  # inspect your /proc/interrupts file for multiple devices having the same interrupt
+
  
% uname -a  # system architecture
+
==System information==
  % grep ^VERSION /etc/SuSE-release  # To see which SuSE Linux version you are using
+
  $ dmesg
% cat /etc/mandriva-release        # To see which Mandriva Linux version you are using
+
  $ iostat
% dmesg | head -1                  # full version info.
+
  $ vmstat # to quickly monitor CPU, memory, and I/O usage and decide which is the bottleneck
  % cat /proc/version                # full version info.
+
  % cat /etc/issue                  # display Linux distribution
+
% pstree
+
% lsof | grep TCP                  # list open files
+
% lsof | grep ' root ' | awk '{print $NF}' | sort | uniq | wc -l # list number of open files for a user
+
  
  % getconf    # print system configuration variables
+
  $ ps -ef | egrep '^root ' | gawk '{print $2}' # method 1
  % getconfig  # get configuration information for the Xorg server
+
  $ pgrep -u root                                # method 2
  % systool    # view system device information by bus, class, and topology
+
% dmidecode  # DMI table decoder
+
% biosdecode  # BIOS information decoder
+
% bind -P    # print keyboard bindings
+
  
  % cat /proc/scsi/scsi
+
  $ cat /proc/cpuinfo
 +
$ cat /proc/partitions
 +
$ cat /proc/meminfo
 +
$ cat /proc/sys/vm/swappiness  # number from 0 - 100; the higher the number the more the system will swap
 +
$ cat /proc/interrupts  # inspect your /proc/interrupts file for multiple devices having the same interrupt
 +
 
 +
$ lspci # lists all PCI buses and devices connected to them
 +
$ lsusb # lists all USB buses and any connected USB devices
 +
$ lshal # lists all devices the hardware abstraction layer (HAL) knows about (should be most hardware on your system)
 +
$ lshw  # lists hardware on your system, including maker, type, and where it is connected
 +
 
 +
$ uname -a  # system architecture
 +
$ cat /etc/issue                  # Display distribution and version (on some distros)
 +
$ lsb_release -a
 +
$ grep ^VERSION /etc/SuSE-release  # To see which SuSE Linux version you are using
 +
$ cat /etc/mandriva-release        # To see which Mandriva Linux version you are using
 +
$ dmesg | head                    # full version info.
 +
$ cat /proc/version                # full version info.
 +
$ cat /etc/issue                  # display Linux distribution
 +
$ pstree
 +
$ lsof | grep TCP                  # list open files
 +
$ lsof |grep ' root ' |awk '{print $NF}' |sort|uniq|wc -l # list number of open files for a user
 +
$ lsof -i :22 # list all connections via port 22 (i.e., ssh)
 +
$ strace 'command'                # trace system calls and signals for a given command
 +
 
 +
$ getconf    # print system configuration variables
 +
$ getconfig  # get configuration information for the Xorg server
 +
$ systool    # view system device information by bus, class, and topology
 +
$ dmidecode  # DMI table decoder
 +
$ dmidecode -s system-product-name
 +
$ dmidecode -s system-manufacturer
 +
$ biosdecode  # BIOS information decoder
 +
$ bind -P    # print keyboard bindings
 +
 
 +
$ acpi -t    # Check current system temperature (package might not be installed by default)
 +
$ finger -l  # Display information about all system users
 +
 
 +
$ cat /proc/scsi/scsi
 
   WDC WD2000JD-22H Rev: 08.0
 
   WDC WD2000JD-22H Rev: 08.0
 
   SATA-I, 200 GB, 150 MB/s, 8 MB Cache, 7200 RPM
 
   SATA-I, 200 GB, 150 MB/s, 8 MB Cache, 7200 RPM
  
% hdparm -t /dev/hdc  # HDD benchmark
+
===See also===
  /dev/hdc:
+
*[http://ezix.org/project/wiki/HardwareLiSter lshw] (Hardware Lister)
  Timing buffered disk reads:  110 MB in  3.05 seconds =  36.08 MB/sec
+
*[[hdparm]] &mdash; get/set hard disk parameters
  
== Managing modules / devices / libraries / etc ==
+
==Managing modules / devices / libraries / objects / etc==
  % lspci
+
  $ lspci
  % lsmod
+
  $ lsmod
  % depmod
+
  $ depmod
  % modprobe  # tail /var/log/messages (to check success / failure)
+
  $ modprobe  # tail /var/log/messages (to check success / failure)
  % rmmod
+
  $ modprobe -l |more  # list all the modules available for your [[kernel]].
 +
$ rmmod
 +
$ ldd /path/to/library/file        # print shared library dependencies
 +
$ nm /path/to/object/file          # list symbols from object files
 +
$ nm [-s|--print-armap] /path/to/object/file # list index generated from a ranlib
  
  % ldd /usr/bin/python  # print shared library dependencies
+
  $ ldd /usr/bin/python  # print shared library dependencies
 
         linux-gate.so.1 =>  (0xffffe000)
 
         linux-gate.so.1 =>  (0xffffe000)
 
         libpython2.5.so.1.0 => /usr/lib/libpython2.5.so.1.0 (0xb7e2e000)
 
         libpython2.5.so.1.0 => /usr/lib/libpython2.5.so.1.0 (0xb7e2e000)
Line 59: Line 89:
  
 
===Default runlevel===
 
===Default runlevel===
It is a good idea to make the defaul runlevel for your machine "3" (i.e. full multiuser mode ''without'' X11). This will prevent your system from hanging if something is wrong with your X11 settings (the graphics).
+
It is a good idea to make the default runlevel for your machine "3" (i.e. full multiuser mode ''without'' X11). This will prevent your system from hanging if something is wrong with your X11 settings (the graphics).
  
 
To change the default runlevel, edit your <code>/etc/inittab</code> file and change the line that reads
 
To change the default runlevel, edit your <code>/etc/inittab</code> file and change the line that reads
Line 67: Line 97:
  
 
Now, every time you turn on your machine (or reboot it), you will be taken to a CLI. Log in as a user (''not'' root!) and enter the following:
 
Now, every time you turn on your machine (or reboot it), you will be taken to a CLI. Log in as a user (''not'' root!) and enter the following:
  % startx
+
  $ startx
  
 
  see also: [[wikipedia:init]]
 
  see also: [[wikipedia:init]]
Line 74: Line 104:
 
  see: [[SuSE wireless card configuration]]
 
  see: [[SuSE wireless card configuration]]
  
  % hostname -i          # show current IP address
+
  $ hostname -i          # show current IP address
  % hostname -d          # show current domain name
+
  $ hostname -d          # show current domain name
  % domainname            # show full domain name
+
  $ domainname            # show full domain name
  % cat /etc/hosts        # show host configuration
+
  $ traceroute
  % cat /etc/sysconfig/network  # show gateway configuration
+
$ mtr
  % cat /etc/resolv.conf  # show DNS configuration (aka "nameserver(s)"; one per line)
+
$ bwm-ng
  % cat /etc/[[iftab]]       # show MAC address (and various network interfaces)
+
$ dig
 +
$ cat /etc/hosts        # show host configuration
 +
  $ cat /etc/sysconfig/network  # show gateway configuration
 +
  $ cat /etc/resolv.conf  # show DNS configuration (aka "nameserver(s)"; one per line)
 +
  $ cat /etc/[[iftab]]   # show MAC address (and various network interfaces; only for some distros)
 +
$ cat /proc/net/arp    # show MAC address (and various network interfaces)
 +
$ arp                  # manipulate the system [[wikipedia:Address Resolution Protocol|ARP]] cache
  
  % /etc/init.d/network restart
+
  $ ip a show
  % route add 20.0.xxx.xxx gateway foo
+
$ /etc/init.d/network restart
  % /etc/rc.local
+
  $ route add 20.0.xxx.xxx gateway foo
  % /etc/sysconfig/network-scripts
+
  $ /etc/rc.local
  % /sbin/ifconfig
+
  $ /etc/sysconfig/network-scripts
 +
  $ /sbin/ifconfig
  
  % netstat -nr
+
  $ ethtool -s eth0 speed 100 duplex full autoneg off  # force 100 Mb/s full duplex with autonegotiation off
 +
$ ethtool eth0  # to check that it worked
 +
$ netstat -ivn  # for tuning
 +
$ mii-tool --force=100baseTx-FD eth0  # obsolete way
 +
 
 +
$ netstat -nr
 
  Kernel IP routing table
 
  Kernel IP routing table
 
  Destination    Gateway        Genmask        Flags  MSS Window  irtt Iface
 
  Destination    Gateway        Genmask        Flags  MSS Window  irtt Iface
 
  20.0.xxx.xx    20.0.xx.xx      255.255.255.0  UGH      0 0          0 eth0
 
  20.0.xxx.xx    20.0.xx.xx      255.255.255.0  UGH      0 0          0 eth0
  
  % cat /proc/net/arp
+
  $ cat /proc/net/arp
 
  IP address      HW type    Flags      HW address            Mask    Device
 
  IP address      HW type    Flags      HW address            Mask    Device
 
  192.168.xxx.xxx  0x1        0x2        00:00:00:00:00:00    *        eth0
 
  192.168.xxx.xxx  0x1        0x2        00:00:00:00:00:00    *        eth0
 
  192.168.xxx.xxx  0x1        0x2        00:00:00:00:00:00    *        eth0
 
  192.168.xxx.xxx  0x1        0x2        00:00:00:00:00:00    *        eth0
 +
 +
$ netstat -plant  # extremely useful for troubleshooting
 +
 +
===NFS===
 +
Check your <code>/etc/exports</code> for directories mountable by IP address. E.g.,
 +
/mnt/disk/data 10.0.67.53(rw) 10.0.67.123(ro)
 +
 +
Then execute the following (as root):
 +
/usr/sbin/exportfs -a
 +
 +
Add the following line (for static routes) to your <code>/etc/sysconfig/network-scripts/route-eth0</code>
 +
10.0.34.54 via 10.0.67.43
 +
 +
You can also accomplish the above via the CLI:
 +
ip route add 10.0.34.54 via 10.0.67.43 dev eth0
 +
 +
Use <code>/etc/sysconfig/network</code> for your default gateway. E.g.,
 +
HOSTNAME=foo.bar.com
 +
NETWORKING=yes
 +
GATEWAY=10.0.54.123
 +
GATEWAYDEV=eth0
 +
 +
===Logging===
 +
*If you are getting a bunch of
 +
martian destination 0.0.0.0 from xxx.xxx.xxx.xxx, dev eth0
 +
messages in your logs (check <code>dmesg |grep martian</code>), you can turn this off by editing your <code>/etc/sysctl.conf</code> and changing:
 +
net.ipv4.conf.all.log_martians=1
 +
~ TO ~
 +
net.ipv4.conf.all.log_martians=0
 +
 +
*Follow or watch the httpd live fullstatus/requests
 +
watch -n1 "cat /proc/loadavg && free -m|grep / && service httpd fullstatus|egrep 'GET|POST|VHost|request'"
  
 
===External resources===
 
===External resources===
Line 104: Line 178:
 
*[http://ndiswrapper.sourceforge.net/mediawiki/index.php/Main_Page Ndiswrapper Wiki]
 
*[http://ndiswrapper.sourceforge.net/mediawiki/index.php/Main_Page Ndiswrapper Wiki]
 
*[http://rt2x00.serialmonkey.com/wiki/index.php?title=Main_Page the rt2x00 Open Source Project]
 
*[http://rt2x00.serialmonkey.com/wiki/index.php?title=Main_Page the rt2x00 Open Source Project]
 +
*[http://linux-ip.net/ The Guide to IP Layer Network Administration with Linux]
  
== Display (Monitor / Graphics Card) ==
+
==Force umount when the "device is busy"==
  % cat /etc/X11/xorg.conf
+
$ fuser -km /mnt/hda1
  % xdpyinfo | grep dimen   # for screen dimensions
+
 
 +
==Display (monitor / graphics card)==
 +
  $ cat /etc/X11/xorg.conf
 +
  $ cat /etc/X11/xdm/Xservers  # lists commands used to start the local X-server
 +
$ xdpyinfo | grep dimen     # for screen dimensions
  
 
If you are having trouble (in [[SuSE]]) getting your monitor to display anything (either from an initial boot or from adding a new monitor), try the following:
 
If you are having trouble (in [[SuSE]]) getting your monitor to display anything (either from an initial boot or from adding a new monitor), try the following:
Line 114: Line 193:
 
* At the command prompt type: <tt>sax2 -m 0=vesa</tt>
 
* At the command prompt type: <tt>sax2 -m 0=vesa</tt>
 
* Configure video settings and ''test'' them (it is important to test your settings first!)
 
* Configure video settings and ''test'' them (it is important to test your settings first!)
* Reboot in normal mode.  
+
* Reboot in normal mode.
  
== Backing up the MBR ==
+
===External links===
It is easy to backup and restore the master boot record (MBR) in Linux. However, caution must be exorcised when performing any of the following commands.
+
*[http://www.linux.com/feature/118108 Editing basics for the xorg.conf file]
 +
*[http://www.xfree86.org/current/RELNOTES4.html XFree86 Video Drivers]
 +
*[http://linuxplanet.com/linuxplanet/tutorials/3163/2/ New HOWTO: XFree86 Font Deuglification Mini HOWTO]
 +
*[http://ati.amd.com/support/drivers/linux/linux-radeon.html ATI Proprietary Linux x86 Display Driver]
 +
*[http://intellinuxgraphics.org/index.html Linux Graphics Drivers from Intel]
  
 +
==Backing up the MBR==
 +
{{warning-dangerous}}
 +
It is easy to back up and restore the master boot record (MBR) in Linux. However, caution ''must'' be exercised when performing any of the following commands.
 
* to backup
 
* to backup
  dd if=/dev/xxx of=mbr.backup bs=512 count=1
+
  $ dd if=/dev/xxx of=mbr.backup bs=512 count=1
 
* to restore
 
* to restore
  dd if=mbr.backup of=/dev/xxx bs=512 count=1
+
  $ dd if=mbr.backup of=/dev/xxx bs=512 count=1
 
where <code>xxx</code> is the device, which can be <code>hda</code>, <code>sda</code>, or any other.
 
where <code>xxx</code> is the device, which can be <code>hda</code>, <code>sda</code>, or any other.
  
== Sound problems ==
+
==Sound problems==
 
Note, my sound card specs: Intel Corporation 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller
 
Note, my sound card specs: Intel Corporation 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller
  
* Un-mute PCM sound
+
*Un-mute PCM sound
* Check the following:
+
*Check the following:
  % lsmod | grep snd
+
  $ lsmod | grep snd
  % cat /etc/modprobe.conf
+
  $ cat /etc/modprobe.conf
  % vi /etc/modprobe.d/sound
+
  $ vi /etc/modprobe.d/sound
 
(change "snd-intel8x0" to "snd_intel8x0")
 
(change "snd-intel8x0" to "snd_intel8x0")
  % ./sbin/lspci
+
  $ ./sbin/lspci
 
(list sound specs / Multimedia audio controller)
 
(list sound specs / Multimedia audio controller)
  % modprobe snd-via82xx
+
  $ modprobe snd-via82xx
  
== Hacked? ==
+
==Configuring a Firewire (IEEE1394) interface==
 +
Check that the file <code>/etc/modules.conf</code> (for 2.4 kernels) or <code>/etc/modprobe.conf</code> (for 2.6 kernels) contains the line:
 +
alias ieee1394-controller ohci1394
 +
Note: If you do not have a <code>/etc/modprobe.conf</code> file, there is a utility to create one. As root, type:
 +
$ /sbin/generate-modprobe.conf > /etc/modprobe.conf
 +
'''Caution''': This will overwrite the previous file (if one existed), so it would be best to back it up first. Take extra caution that it has ''not'' altered your previous (working) video and/or sound driver configurations!
 +
 
 +
You might need to reboot with your Firewire device plugged in (if hotplugging is not set up) and check that the card is recognised and the module loaded with the following command (as root):
 +
$ lsmod | grep 1394
 +
  ohci1394              32240  0
 +
  ieee1394              286264  1 ohci1394
 +
 
 +
==Repair corrupted .Xauthority file==
 +
$ mkxauth -u user -c
 +
 
 +
==Adobe acroread "adobe expr: syntax error"==
 +
For some reason, after installing the latest version of Adobe Reader 7.0.9 for Linux (aka <tt>acroread</tt>; 2007-04-11) and running it, I get an infinite loop of "<code>expr: syntax error</code>".
 +
 
 +
After digging around Google a bit, I found a simple solution (not sure if this is the best one). <code>/usr/bin/acroread</code> is just a [[Bash|Bourne shell script]] text executable. Open this script and replace the following
 +
$ echo $mfile| sed 's/libgtk-x11-([0-9]*).0.so.0.([0-9])00.([0-9]*)|(.*)/123/g'
 +
# ~OR~
 +
$ echo $mfile| sed 's/libgtk-x11-\([0-9]*\).0.so.0.\([0-9]\)00.\([0-9]*\)\|\(.*\)/\1\2\3/g'
 +
with the following
 +
$ echo $mfile| sed 's/libgtk-x11-([0-9]*).0.so.0.([0-9]*)00.([0-9]*)|(.*)/123/g'
 +
# ~OR~
 +
$ echo $mfile| sed 's/libgtk-x11-\([0-9]*\).0.so.0.\([0-9]*\)00.\([0-9]*\)\|\(.*\)/\1\2\3/g'
 +
The only difference is the [[regex]] "<code>*</code>" symbol added after the second <code>[0-9]</code>, so that it matches any number of digits rather than exactly one.
 +
 
 +
That should do it. Not sure why, how, or if this is the problem, but it seems to work just fine on my machine (Note: [[SuSE|openSuSE]] didn't need this fix; [[Mandriva Linux]] 2007.0 did).
 +
 
 +
==Bootsplash==
 +
I like to turn off my bootsplash (otherwise, I am always hitting the "Esc" key). This can be accomplished by setting the <code>/etc/sysconfig/bootsplash</code> file to:
 +
SPLASH="no"  # disables bootup graphics
 +
Further controls can be found in your <code>/etc/bootsplash/themes/*/config/</code> directory. For an example, in SuSE it is located here:
 +
/etc/bootsplash/themes/SuSE/config/
 +
/etc/bootsplash/themes/SuSE/config/bootsplash-1440x900.cfg  # example cfg file
 +
 
 +
==Hacked?==
 
* Check for failed logins in: <code>/var/log/messages</code>
 
* Check for failed logins in: <code>/var/log/messages</code>
 
* Regularly monitor:
 
* Regularly monitor:
 
** <pre>zcat /var/log/auth.log.*.gz | grep refused</pre>
 
** <pre>zcat /var/log/auth.log.*.gz | grep refused</pre>
 
** <pre>grep -i failed /var/log/auth.log</pre>
 
** <pre>grep -i failed /var/log/auth.log</pre>
** <tt>last</tt>
+
** <tt>last</tt> (successful logins) / <tt>lastb</tt> (unsuccessful logins)
 
** <tt>w</tt> and/or <tt>who</tt>
 
** <tt>w</tt> and/or <tt>who</tt>
 
** <tt>uptime</tt>
 
** <tt>uptime</tt>
* Verify that <code>/etc/passwd</code> hasn't changed.
+
* Verify that <code>/etc/passwd</code> has not changed.
 
* Check <tt>fuser</tt> for ports.
 
* Check <tt>fuser</tt> for ports.
 
* Search for portscans in server report.
 
* Search for portscans in server report.
 
* Check for weird processes hogging the CPU.
 
* Check for weird processes hogging the CPU.
 +
* Install and use [http://rkhunter.sourceforge.net/ rkhunter]
 
* Use [[Fail2ban|fail2ban]], [[DenyHosts]], etc.
 
* Use [[Fail2ban|fail2ban]], [[DenyHosts]], etc.
 +
===See also===
 +
*[http://www.hackinglinuxexposed.com/articles/20031214.html Hacking Linux Exposed: The mysteriously persistently exploitable program explained]
 +
 +
==See also==
 +
*[[Recovery Is Possible]] (RIP / (R)ecovery (I)s (P)ossible) &mdash; a Linux-based CD with partition tool and network tools ([[Samba]])
 +
*[http://www.hiren.info/pages/bootcd Hiren's Boot CD Home Page] &mdash; a list of the software included on the boot CD.
 +
*[http://www.darknet.org.uk/2006/03/10-best-security-live-cd-distros-pen-test-forensics-recovery/ 10 Best Security Live CD Distros (Pen-Test, Forensics & Recovery)]
 +
*[http://www.911cd.net/ 911 Rescue CD] &mdash; based on DOS with tools for Windows repairs (not technically a LiveDistro)
 +
*[http://www.feyrer.de/g4u/ g4u] &mdash; hard disk image cloning for PCs
 +
*[http://www.sysresccd.org/ SystemRescueCd] &mdash; a Linux-based CD with tools for Windows and Linux repairs, based on the 2.6 kernel.
 +
*[http://treehel.alfamoon.com/index.php?module=articles&c=articles&b=1&a=5 treehel's FreeSTAR] &mdash; a free UBCD-based boot CD with a huge additional collection of free and open software for Windows and with a Russian and English Windows-interface
 +
*[http://trinityhome.org/Home/index.php?wpid=1&front_id=12 Trinity Rescue Kit] &mdash; Mandriva Linux-based CD for use on a Windows or Linux based system
 +
*[http://www.ultimatebootcd.com/ UBCD] &mdash; free boot CD - (U)ltimate (B)oot CD (DOS/Linux)
 +
*[http://www.ubcd4win.com UBCD4Win] &mdash; based on [[BartPE]], it can also be combined with UBCD.
  
== Notes ==
+
==Notes==
 
* <tt>pstree</tt> &mdash; display a tree of processes
 
* <tt>pstree</tt> &mdash; display a tree of processes
 
* <tt>lsmod</tt> &mdash; program to show the status of modules in the Linux Kernel
 
* <tt>lsmod</tt> &mdash; program to show the status of modules in the Linux Kernel
Line 161: Line 299:
  
 
==External links==
 
==External links==
*[http://susewiki.org/index.php?title=Main_Page SuSE wiki]
 
 
*[http://ubuntuforums.org/showthread.php?t=191205 How to get specific programs to run under Dapper Drake 64-bit edition]
 
*[http://ubuntuforums.org/showthread.php?t=191205 How to get specific programs to run under Dapper Drake 64-bit edition]
 
*[http://www.tux.org/pub/people/kent-robotti/looplinux/rip/ the (R)ecovery (I)s (P)ossible Linux rescue system]
 
*[http://www.tux.org/pub/people/kent-robotti/looplinux/rip/ the (R)ecovery (I)s (P)ossible Linux rescue system]
Line 172: Line 309:
 
*[http://kevin.hatfieldfamilysite.com/?p=147 28 Steps on how to harden your linux server]
 
*[http://kevin.hatfieldfamilysite.com/?p=147 28 Steps on how to harden your linux server]
 
*[http://linux.inet.hr/how_fast_is_your_disk.html How fast is your disk?]
 
*[http://linux.inet.hr/how_fast_is_your_disk.html How fast is your disk?]
 +
*[http://cvs.mandriva.com/cgi-bin/viewvc.cgi/gi/rescue/ Mandriva rescue tree]
 +
*[http://people.redhat.com/dledford/memtest.html Linux hardware memory test script]
  
 
[[Category:Technical and Specialized Skills]]
 
[[Category:Technical and Specialized Skills]]
 
[[Category:Linux Command Line Tools]]
 
[[Category:Linux Command Line Tools]]

Latest revision as of 19:11, 6 March 2015

This category will contain a collection of articles on troubleshooting Linux. It will also include a lot of random commands I use for troubleshooting. It will be highly biased towards Red Hat-based (e.g., CentOS) and Debian-based (e.g., Ubuntu) distros, but most of the commands should work on most Linux distros. I am also using a 64-bit (x86_64) kernel, so my articles will also be biased towards these systems.

DISCLAIMER: THIS INFORMATION IS PROVIDED TO YOU "AS-IS". NO WARRANTIES OF ANY KIND, EXPRESSED OR IMPLIED, ARE MADE TO YOU AS TO THE CONTENT OF THIS PAGE OR ANY MEDIUM IT MAY BE ON, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

I DISCLAIM ALL RESPONSIBILITY FOR THE ACCURACY AND RELIABILITY OF INFORMATION ON THIS PAGE OR PAGES IT LINKS TO.

Emergency reboot

see: wikipedia:Magic SysRq key

Alt + SysRq + REISUB ("Raising Elephants Is So Utterly Boring"; execute in slow succession)

  • For the above to work, you must have it enabled first:
echo "1" >/proc/sys/kernel/sysrq

In order to have it always enabled (e.g., after a reboot), one must enable it in the /etc/sysconfig/sysctl file (in SuSE, at least):

ENABLE_SYSRQ="yes"

For further information see /usr/src/linux/Documentation/sysrq.txt
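
As a minimal sketch of the check described above: the helper name `sysrq_describe` is hypothetical, and the interpretation assumes the kernel's documented behaviour (0 disables SysRq, 1 enables all functions, larger values are a bitmask of allowed functions).

```shell
#!/bin/sh
# Hypothetical helper: describe a kernel.sysrq value.
sysrq_describe() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "bitmask $1 (subset of functions enabled)" ;;
    esac
}

# Report the current setting if /proc is available.
if [ -r /proc/sys/kernel/sysrq ]; then
    sysrq_describe "$(cat /proc/sys/kernel/sysrq)"
fi

# To enable at runtime (as root), either of these should work:
#   echo 1 > /proc/sys/kernel/sysrq
#   sysctl -w kernel.sysrq=1
```

On distros that read /etc/sysctl.conf at boot, a line such as `kernel.sysrq = 1` in that file should make the setting persistent.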

System information

$ dmesg
$ iostat
$ vmstat  # to quickly monitor CPU, memory, and I/O usage and decide which is the bottleneck
$ ps -ef | egrep '^root ' | gawk '{print $2}'  # method 1
$ pgrep -u root                                # method 2
$ cat /proc/cpuinfo
$ cat /proc/partitions
$ cat /proc/meminfo
$ cat /proc/sys/vm/swappiness  # number from 0 - 100; the higher the number the more the system will swap
$ cat /proc/interrupts   # inspect your /proc/interrupts file for multiple devices having the same interrupt
$ lspci # lists all PCI buses and devices connected to them
$ lsusb # lists all USB buses and any connected USB devices
$ lshal # lists all devices the hardware abstraction layer (HAL) knows about (should be most hardware on your system)
$ lshw  # lists hardware on your system, including maker, type, and where it is connected
$ uname -a  # system architecture
$ cat /etc/issue                   # Display distribution and version (on some distros)
$ lsb_release -a
$ grep ^VERSION /etc/SuSE-release  # To see which SuSE Linux version you are using
$ cat /etc/mandriva-release        # To see which Mandriva Linux version you are using
$ dmesg | head                     # full version info.
$ cat /proc/version                # full version info.
$ cat /etc/issue                   # display Linux distribution
$ pstree
$ lsof | grep TCP                  # list open files
$ lsof |grep ' root ' |awk '{print $NF}' |sort|uniq|wc -l # list number of open files for a user
$ lsof -i :22 # list all connections via port 22 (i.e., ssh)
$ strace 'command'                 # trace system calls and signals for a given command
$ getconf     # print system configuration variables
$ getconfig   # get configuration information for the Xorg server
$ systool     # view system device information by bus, class, and topology
$ dmidecode   # DMI table decoder
$ dmidecode -s system-product-name
$ dmidecode -s system-manufacturer
$ biosdecode  # BIOS information decoder
$ bind -P     # print keyboard bindings
$ acpi -t     # Check current system temperature (package might not be installed by default)
$ finger -l   # Display information about all system users
$ cat /proc/scsi/scsi
  WDC WD2000JD-22H Rev: 08.0
  SATA-I, 200 GB, 150 MB/s, 8 MB Cache, 7200 RPM
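
The /proc/interrupts inspection mentioned above can be narrowed down with a small filter. This is only a sketch: the helper name `shared_irqs` is hypothetical, and it assumes that devices sharing an interrupt are listed comma-separated on the same line, which is how the kernel usually prints them.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a /proc/interrupts-style file
# that list more than one device (comma-separated), i.e. a shared IRQ.
shared_irqs() {
    # Skip the CPU header line; a comma marks multiple devices on one IRQ.
    awk 'NR > 1 && /,/' "${1:-/proc/interrupts}"
}

# Usage: shared_irqs            # inspect the running system
#        shared_irqs saved.txt  # inspect a saved copy
```

An empty result suggests that no two devices share an interrupt line.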

See also

  • lshw (Hardware Lister)
  • hdparm — get/set hard disk parameters

Managing modules / devices / libraries / objects / etc

$ lspci
$ lsmod
$ depmod
$ modprobe  # tail /var/log/messages (to check success / failure)
$ modprobe -l |more   # list all the modules available for your kernel.
$ rmmod
$ ldd /path/to/library/file        # print shared library dependencies
$ nm /path/to/object/file          # list symbols from object files
$ nm [-s|--print-armap] /path/to/object/file # list index generated from a ranlib
$ ldd /usr/bin/python   # print shared library dependencies
       linux-gate.so.1 =>  (0xffffe000)
       libpython2.5.so.1.0 => /usr/lib/libpython2.5.so.1.0 (0xb7e2e000)
       libpthread.so.0 => /lib/libpthread.so.0 (0xb7e16000)
       libdl.so.2 => /lib/libdl.so.2 (0xb7e12000)
       libutil.so.1 => /lib/libutil.so.1 (0xb7e0e000)
       libm.so.6 => /lib/libm.so.6 (0xb7de7000)
       libc.so.6 => /lib/libc.so.6 (0xb7cb9000)
       /lib/ld-linux.so.2 (0xb7f71000)

Default runlevel

It is a good idea to make the default runlevel for your machine "3" (i.e. full multiuser mode without X11). This will prevent your system from hanging if something is wrong with your X11 settings (the graphics).

To change the default runlevel, edit your /etc/inittab file and change the line that reads

id:5:initdefault:

to

id:3:initdefault:

Now, every time you turn on your machine (or reboot it), you will be taken to a CLI. Log in as a user (not root!) and enter the following:

$ startx
see also: wikipedia:init
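
The inittab edit described above can also be scripted. The following is a minimal sketch, assuming a SysV-style /etc/inittab with an `id:N:initdefault:` line; the helper name `set_default_runlevel` is hypothetical, and it keeps a `.bak` copy before rewriting the file.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the initdefault line of an inittab-style file.
# Usage: set_default_runlevel FILE LEVEL
set_default_runlevel() {
    file=$1
    level=$2
    cp "$file" "$file.bak" || return 1   # keep a backup first
    # Replace the runlevel field of the "id:N:initdefault:" line.
    sed "s/^id:[0-9]:initdefault:/id:${level}:initdefault:/" "$file.bak" > "$file"
}

# Example (as root):
#   set_default_runlevel /etc/inittab 3
```

On systemd-based distros, which do not use /etc/inittab for this, `systemctl set-default multi-user.target` should have the equivalent effect.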

Linux networking

see: SuSE wireless card configuration
$ hostname -i           # show current IP address
$ hostname -d           # show current domain name
$ domainname            # show full domain name
$ traceroute
$ mtr
$ bwm-ng
$ dig
$ cat /etc/hosts        # show host configuration
$ cat /etc/sysconfig/network   # show gateway configuration
$ cat /etc/resolv.conf  # show DNS configuration (aka "nameserver(s)"; one per line)
$ cat /etc/iftab    # show MAC address (and various network interfaces; only for some distros)
$ cat /proc/net/arp     # show MAC address (and various network interfaces)
$ arp                   # manipulate the system ARP cache
$ ip a show
$ /etc/init.d/network restart
$ route add 20.0.xxx.xxx gateway foo
$ /etc/rc.local
$ /etc/sysconfig/network-scripts
$ /sbin/ifconfig
$ ethtool -s eth0 speed 100 duplex full autoneg off  # force 100 Mb/s full duplex with autonegotiation off
$ ethtool eth0   # to check that it worked
$ netstat -ivn   # for tuning
$ mii-tool --force=100baseTx-FD eth0   # obsolete way
$ netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
20.0.xxx.xx     20.0.xx.xx      255.255.255.0   UGH       0 0          0 eth0
$ cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
192.168.xxx.xxx  0x1         0x2         00:00:00:00:00:00     *        eth0
192.168.xxx.xxx  0x1         0x2         00:00:00:00:00:00     *        eth0
$ netstat -plant  # extremely useful for troubleshooting
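
To pull just the IP-to-MAC mapping out of the /proc/net/arp table shown above, a short awk filter is enough. This is a sketch only: the helper name `arp_entries` is hypothetical, and it assumes the six-column layout from the sample output.

```shell
#!/bin/sh
# Hypothetical helper: print "IP MAC DEVICE" for each entry of a
# /proc/net/arp-style table.
arp_entries() {
    # Line 1 is the header; fields: IP, HW type, Flags, HW address, Mask, Device.
    awk 'NR > 1 { print $1, $4, $6 }' "${1:-/proc/net/arp}"
}

# Usage: arp_entries
```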

NFS

Check your /etc/exports for directories mountable by IP address. E.g.,

/mnt/disk/data 10.0.67.53(rw) 10.0.67.123(ro)

Then execute the following (as root):

/usr/sbin/exportfs -a

Add the following line (for static routes) to your /etc/sysconfig/network-scripts/route-eth0

10.0.34.54 via 10.0.67.43

You can also accomplish the above via the CLI:

ip route add 10.0.34.54 via 10.0.67.43 dev eth0

Use /etc/sysconfig/network for your default gateway. E.g.,

HOSTNAME=foo.bar.com
NETWORKING=yes
GATEWAY=10.0.54.123
GATEWAYDEV=eth0

Logging

  • If you are getting a bunch of
martian destination 0.0.0.0 from xxx.xxx.xxx.xxx, dev eth0

messages in your logs (check dmesg |grep martian), you can turn this off by editing your /etc/sysctl.conf and changing:

net.ipv4.conf.all.log_martians=1
~ TO ~
net.ipv4.conf.all.log_martians=0
  • Follow or watch the httpd live fullstatus/requests
watch -n1 "cat /proc/loadavg && free -m|grep / && service httpd fullstatus|egrep 'GET|POST|VHost|request'"
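
The log_martians change described above can be scripted rather than edited by hand. The following is a minimal sketch, assuming a plain `key=value` style /etc/sysctl.conf; the helper name `set_sysctl_key` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a sysctl.conf-style file,
# replacing an existing assignment or appending a new one.
set_sysctl_key() {
    file=$1
    key=$2
    value=$3
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        # Dots in the key act as regex wildcards here, which is harmless for this use.
        sed "s|^[[:space:]]*${key}[[:space:]]*=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Example (as root), then reload the file without rebooting:
#   set_sysctl_key /etc/sysctl.conf net.ipv4.conf.all.log_martians 0
#   sysctl -p
```

For a one-off change that does not survive a reboot, `sysctl -w net.ipv4.conf.all.log_martians=0` should be sufficient.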

External resources

Force umount when the "device is busy"

$ fuser -km /mnt/hda1

Display (monitor / graphics card)

$ cat /etc/X11/xorg.conf
$ cat /etc/X11/xdm/Xservers  # lists commands used to start the local X-server
$ xdpyinfo | grep dimen      # for screen dimensions

If you are having trouble (in SuSE) getting your monitor to display anything (either from an initial boot or from adding a new monitor), try the following:

  • Reboot in Failsafe mode
  • Login as root
  • At the command prompt type: sax2 -m 0=vesa
  • Configure video settings and test them (it is important to test your settings first!)
  • Reboot in normal mode.

External links

Backing up the MBR

WARNING: This article or section describes techniques which can be dangerous for your computer's hardware or the data on it. I have tested everything I describe herein on my personal computers. However, absolutely no guarantee can be made that it will work for you. Proceed with caution!

It is easy to back up and restore the master boot record (MBR) in Linux. However, caution must be exercised when performing any of the following commands.

  • to backup
$ dd if=/dev/xxx of=mbr.backup bs=512 count=1
  • to restore
$ dd if=mbr.backup of=/dev/xxx bs=512 count=1

where xxx is the device, which can be hda, sda, or any other.
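
A slightly more defensive version of the backup step might verify the copy before trusting it. This is a sketch under assumptions: the helper name `backup_mbr` is hypothetical, and the device path is whatever you would substitute for /dev/xxx above.

```shell
#!/bin/sh
# Hypothetical helper: copy the first 512-byte sector of DEVICE to OUTFILE
# and verify the copy. Usage: backup_mbr DEVICE OUTFILE
backup_mbr() {
    dev=$1
    out=$2
    dd if="$dev" of="$out" bs=512 count=1 2>/dev/null || return 1
    # Compare the first 512 bytes of the device against the backup file.
    head -c 512 "$dev" | cmp -s - "$out"
}

# Example (as root; replace xxx with your device):
#   backup_mbr /dev/xxx mbr.backup && echo "MBR backup verified"
```

Note that the 512-byte sector holds the boot code (first 446 bytes), the partition table (64 bytes), and a 2-byte signature, so restoring all 512 bytes also overwrites the current partition table. Restoring with bs=446 instead should leave the partition table untouched.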

Sound problems

Note, my sound card specs: Intel Corporation 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller

  • Un-mute PCM sound
  • Check the following:
$ lsmod | grep snd
$ cat /etc/modprobe.conf
$ vi /etc/modprobe.d/sound

(change "snd-intel8x0" to "snd_intel8x0")

$ ./sbin/lspci

(list sound specs / Multimedia audio controller)

$ modprobe snd-via82xx

Configuring a Firewire (IEEE1394) interface

Check that the file /etc/modules.conf (for 2.4 kernels) or /etc/modprobe.conf (for 2.6 kernels) contains the line:

alias ieee1394-controller ohci1394

Note: If you do not have a /etc/modprobe.conf file, there is a utility to create one. As root, type:

$ /sbin/generate-modprobe.conf > /etc/modprobe.conf

Caution: This will overwrite the previous file (if one existed), so it would be best to back it up first. Take extra caution that it has not altered your previous (working) video and/or sound driver configurations!

You might need to reboot with your Firewire device plugged in (if hotplugging is not set up) and check that the card is recognised and the module loaded with the following command (as root):

$ lsmod | grep 1394
  ohci1394               32240  0
  ieee1394              286264  1 ohci1394

Repair corrupted .Xauthority file

$ mkxauth -u user -c

Adobe acroread "adobe expr: syntax error"

For some reason, after installing the latest version of Adobe Reader 7.0.9 for Linux (aka acroread; 2007-04-11) and running it, I get an infinite loop of "expr: syntax error".

After digging around Google a bit, I found a simple solution (not sure if this is the best one). /usr/bin/acroread is just a Bourne shell script text executable. Open this script and replace the following

$ echo $mfile| sed 's/libgtk-x11-([0-9]*).0.so.0.([0-9])00.([0-9]*)|(.*)/123/g'
# ~OR~
$ echo $mfile| sed 's/libgtk-x11-\([0-9]*\).0.so.0.\([0-9]\)00.\([0-9]*\)\|\(.*\)/\1\2\3/g'

with the following

$ echo $mfile| sed 's/libgtk-x11-([0-9]*).0.so.0.([0-9]*)00.([0-9]*)|(.*)/123/g'
# ~OR~
$ echo $mfile| sed 's/libgtk-x11-\([0-9]*\).0.so.0.\([0-9]*\)00.\([0-9]*\)\|\(.*\)/\1\2\3/g'

The only difference is the regex "*" symbol added after the second [0-9], so that it matches any number of digits rather than exactly one.

That should do it. Not sure why, how, or if this is the problem, but it seems to work just fine on my machine (Note: openSuSE didn't need this fix; Mandriva Linux 2007.0 did).
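
To see why the extra "*" matters, the two patterns can be tried on sample library names. This is a simplified sketch, not the code from the acroread script: the alternation is dropped, the dots are escaped, and the function names and sample file names are assumptions chosen for illustration.

```shell
#!/bin/sh
# Simplified versions of the two patterns (alternation dropped, dots escaped).
# old: exactly one digit before "00"; new: any number of digits before "00".
gtk_version_old() {
    echo "$1" | sed -n 's/^libgtk-x11-\([0-9]*\)\.0\.so\.0\.\([0-9]\)00\.\([0-9]*\)$/\1\2\3/p'
}
gtk_version_new() {
    echo "$1" | sed -n 's/^libgtk-x11-\([0-9]*\)\.0\.so\.0\.\([0-9]*\)00\.\([0-9]*\)$/\1\2\3/p'
}

gtk_version_old libgtk-x11-2.0.so.0.800.9     # prints 289
gtk_version_old libgtk-x11-2.0.so.0.1200.9    # prints nothing (no match)
gtk_version_new libgtk-x11-2.0.so.0.1200.9    # prints 2129
```

With a two-digit minor version the one-digit pattern fails to match, which would leave the script with an empty value to hand to expr. That is consistent with the "expr: syntax error" loop, though I have not confirmed it is the only cause.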

Bootsplash

I like to turn off my bootsplash (otherwise, I am always hitting the "Esc" key). This can be accomplished by setting the /etc/sysconfig/bootsplash file to:

SPLASH="no"  # disables bootup graphics

Further controls can be found in your /etc/bootsplash/themes/*/config/ directory. For an example, in SuSE it is located here:

/etc/bootsplash/themes/SuSE/config/
/etc/bootsplash/themes/SuSE/config/bootsplash-1440x900.cfg  # example cfg file

Hacked?

  • Check for failed logins in: /var/log/messages
  • Regularly monitor:
    • zcat /var/log/auth.log.*.gz | grep refused
    • grep -i failed /var/log/auth.log
    • last (successful logins) / lastb (unsuccessful logins)
    • w and/or who
    • uptime
  • Verify that /etc/passwd has not changed.
  • Check fuser for ports.
  • Search for portscans in server report.
  • Check for weird processes hogging the CPU.
  • Install and use rkhunter
  • Use fail2ban, DenyHosts, etc.
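
The failed-login checks in the list above can be summarised per source address. The following is a minimal sketch, assuming sshd-style "Failed password ... from ADDRESS port ..." lines in the log; the helper name `failed_logins_by_ip` is hypothetical, and the log path differs between distros.

```shell
#!/bin/sh
# Hypothetical helper: count "Failed password" lines per source address in an
# sshd-style auth log, most frequent first.
failed_logins_by_ip() {
    grep 'Failed password' "${1:-/var/log/auth.log}" |
        awk '{ for (i = 1; i < NF; i++) if ($i == "from") print $(i + 1) }' |
        sort | uniq -c | sort -rn
}

# Usage (as root): failed_logins_by_ip /var/log/auth.log | head
```

An address with an unusually high count is a reasonable candidate for a closer look, or for a fail2ban or DenyHosts rule.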

See also

See also

Notes

  • pstree — display a tree of processes
  • lsmod — program to show the status of modules in the Linux Kernel
  • modprobe — program to add and remove modules from the Linux Kernel
  • netstat — Print network connections, routing tables, interface statistics, masquerade connections, and multicast memberships
  • lspci — list all PCI devices
  • more /usr/share/pci.ids — A list of all known PCI IDs (vendors, devices, classes, and subclasses). Maintained at The Linux PCI ID Repository; use the update-pciids utility to download the most recent version.

External links
