
Saturday, 11 February 2017

File System State is clean with errors in Linux

ISSUE: A file system on a device reports the state "clean with errors" in Linux

SOLUTION:

==> This issue is usually reported in /var/log/messages or noticed at boot time, and it is critical.

==> It puts data consistency at risk on the affected file system.
==> File system errors can show up on the root file system, SAN file systems, or even cluster file systems.

Steps to fix the issue for Non-Root file systems:
===================================

Suppose /dev/sda10 (mounted on /fs1) reports the file system state "clean with errors".
The output of the df -kh command shows how /dev/sda10 and /fs1 are related.

Example:
/dev/sda10             7.8G  6.3G  1.2G  85% /fs1

1. Back up the data in the /fs1 directory. If possible, keep one copy on a different system and one copy locally.

Use the command below to create a compressed tar file of the data. Archiving /fs1 itself rather than /fs1/* also picks up hidden files at the top level.

#tar -cvzf /fs1_<date>.tar.gz /fs1

This produces a file such as /fs1_20150201.tar.gz (replace <date> with the date you want in the file name).
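
The backup step can be wrapped in a small helper. This is only a sketch: the function names backup_name and backup_fs are made up for illustration, and it assumes the archive goes to / as in the example above.

```shell
#!/bin/sh
# Build the archive name used in step 1, e.g. /fs1_20150201.tar.gz.
# $1 = mount point, $2 = date stamp (defaults to today as YYYYMMDD).
backup_name() {
    stamp=${2:-$(date +%Y%m%d)}
    printf '/%s_%s.tar.gz\n' "$(basename "$1")" "$stamp"
}

# Create the archive, then list it back to confirm it is readable.
# Archiving the directory itself (not /fs1/*) also captures hidden files.
backup_fs() {
    archive=$(backup_name "$1")
    tar -czf "$archive" "$1" && tar -tzf "$archive" >/dev/null
}
```

Usage would be something like backup_fs /fs1, run as root. A zero exit status only means the archive could be written and read back, not that every file on the damaged file system was readable.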

2. #df -kh

If the file system /fs1 is mounted, unmount it:

#umount /fs1

Before attempting a fix, check the extent of the errors. The command below does not repair anything; it only reports the errors present on the file system. fsck accepts the mount point /fs1 here provided it is listed in /etc/fstab.

3. #fsck -n /fs1

Output is like below:
#fsck -n /fs1
fsck 1.39 (29-May-2006)
e2fsck 1.39 (29-May-2006)
Warning: skipping journal recovery because doing a read-only filesystem check.
/fs1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached zero-length inode 179868.  Clear? no
Unattached inode 179868
Connect to /lost+found? no
Pass 5: Checking group summary information
Block bitmap differences:  +386446
Fix? no
Free blocks count wrong for group #11 (2, counted=3).
Fix? no
Inode bitmap differences:  +179868
Fix? no
Free inodes count wrong for group #11 (15350, counted=15351).
Fix? no
Free inodes count wrong (2072117, counted=2072133).
Fix? no

/fs1: ********** WARNING: Filesystem still has errors **********
/fs1: 12747/2084864 files (10.0% non-contiguous), 1953016/4162197 blocks
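
Besides the text output, fsck also reports its result through its exit status, which is a bit mask documented in fsck(8). The sketch below decodes that value; the function name describe_fsck_status is hypothetical.

```shell
#!/bin/sh
# Decode the fsck exit status, which is a bit mask (see fsck(8)).
# $1 = exit status of fsck.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    [ $((status & 1)) -ne 0 ]   && out="$out errors-corrected"
    [ $((status & 2)) -ne 0 ]   && out="$out reboot-needed"
    [ $((status & 4)) -ne 0 ]   && out="$out errors-left-uncorrected"
    [ $((status & 8)) -ne 0 ]   && out="$out operational-error"
    [ $((status & 16)) -ne 0 ]  && out="$out usage-error"
    [ $((status & 32)) -ne 0 ]  && out="$out cancelled-by-user"
    [ $((status & 128)) -ne 0 ] && out="$out shared-library-error"
    # Strip the leading space left by the first appended word.
    echo "${out# }"
}
```

A typical use would be to run fsck -n /fs1, then call describe_fsck_status "$?" on the next line. A read-only check that finds problems, like the one above, is expected to return status 4.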

4. #tune2fs -l /dev/sda10

tune2fs 1.39 (29-May-2006)
Filesystem volume name:   /fs1
Last mounted on:          <not available>
Filesystem UUID:          cf72df5e-24fa-4f40-92c6-244459r49e17
Filesystem magic number:  0xEF18
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal resize_inode dir_index filetype needs_recovery sparse_super large_file
Default mount options:    (none)
Filesystem state:         Clean with errors
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              2081264
Block count:              4164597
Reserved block count:     208789
Free blocks:              2209181
Free inodes:              2074517
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1016
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16288
Inode blocks per group:   509
Filesystem created:       Mon Oct 26 02:50:22 2009
Last mount time:          Sat Feb 11 09:12:30 2017
Last write time:          Sat Feb 11 09:12:30 2017
Mount count:              68
Maximum mount count:      25
Last checked:             Wed May 29 20:38:40 2013
Check interval:           15552000 (6 months)
Next check after:         Mon Nov 25 19:38:40 2013
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
Default directory hash:   tea
Directory Hash Seed:      b16ea660-bcea-4a0a-8e9e-926b058e1928
Journal backup:           inode blocks

The key field in the above output is "Filesystem state", which reads either "clean" or "not clean"; here it reads "Clean with errors".

Check for any errors and note down the output.
The output of the above commands shows that /fs1 is not clean and that fsck found errors, so the file system needs to be repaired to avoid data corruption.
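
If you want to check the state from a script rather than by eye, the field can be pulled out of the tune2fs -l output. A minimal sketch, with hypothetical function names fs_state and is_clean:

```shell
#!/bin/sh
# Print the "Filesystem state" value from `tune2fs -l` output read on stdin.
fs_state() {
    sed -n 's/^Filesystem state:[[:space:]]*//p'
}

# Succeed only when the reported state is exactly "clean" (case-insensitive).
is_clean() {
    state=$(fs_state | tr '[:upper:]' '[:lower:]')
    [ "$state" = "clean" ]
}
```

For example, tune2fs -l /dev/sda10 | fs_state prints the state, and tune2fs -l /dev/sda10 | is_clean can be used in an if statement.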

5. Now run fsck to fix the issues seen above on the /fs1 file system.

Important Note: Running fsck -f -y on a mounted file system can cause data loss. Make sure /fs1 is unmounted first.

Start a screen session first so that a dropped terminal connection does not interrupt the check:

#screen
#fsck -f -y /dev/sda10

This command may run for 4 to 5 hours or more, depending on the size of the file system. Monitor it until it completes.
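
Because a repair on a mounted file system is so risky, it can help to put a guard in front of the fsck call. The sketch below is one way to do it, assuming the device shows up in /proc/mounts under the same name you pass in (this may not hold for LVM or UUID-based mounts). The names is_mounted and safe_fsck are hypothetical.

```shell
#!/bin/sh
# Succeed if device $1 appears as a mount source in the mounts table.
# $2 = mounts file (defaults to /proc/mounts); overridable for testing.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Run the repair only when the device is not mounted.
safe_fsck() {
    if is_mounted "$1"; then
        echo "$1 is mounted; unmount it before running fsck" >&2
        return 1
    fi
    fsck -f -y "$1"
}
```

Usage would be safe_fsck /dev/sda10 inside the screen session.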

6. Verify with tune2fs that the errors on /fs1 are gone:

#tune2fs -l /dev/sda10

The file system state should be clean.

#fsck -n /dev/sda10

Check whether any errors are still reported.

7. If /fs1 is clean, mount it back and start using it as usual

#mount /fs1

=====

Running fsck on the root file system has to be done from single-user mode or maintenance mode, because the root file system cannot be unmounted while the system is running from it.

HAPPY LINUX LEARNING :)

Feel free to ask any questions or start a discussion about this topic.

My other Posts are below:

File System State is clean with errors in Linux:
http://linuxunixdatabase.blogspot.com/2017/02/file-system-state-is-clean-with-errors.html

How to use IPERF to test interface/network throughput in Linux:
http://linuxunixdatabase.blogspot.com/2017/02/how-to-use-iperf-to-test.html

Linux/Unix Network Troubleshooting:
http://linuxunixdatabase.blogspot.com/2017/02/linuxunix-network-troubleshooting.html

Removing existing LVM from your Linux System
http://linuxunixdatabase.blogspot.com/2017/02/removing-existing-lvm-from-your-linux.html

Learning AWK and SED Tools for LINUX/UNIX
http://linuxunixdatabase.blogspot.com/2017/02/learning-awk-and-sed-tools-for-linuxunix.html

Saturday, 4 February 2017

Linux/Unix Network Troubleshooting

This document outlines how to troubleshoot network interface issues on a Linux/UNIX server.

I do not reproduce man page details here, because every Linux distribution ships man pages for all of these commands.


The scenario below walks through what can be done to fix a network interface issue on a Linux server.


Issue: Network connectivity is lost, or throughput has suddenly dropped well below normal, on my Linux server.

SOLUTION:

Work through the following steps in order.

You need to be the root user for all of these steps.

Start by capturing the output of the ifconfig command:

Example:

#ifconfig eth0

eth0      Link encap:Ethernet  HWaddr 01:89:A8:G8:4R:54
         inet addr:192.168.9.5  Bcast:192.168.99.255  Mask:255.255.255.0
         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
         RX packets:190458 errors:0 dropped:0 overruns:0 frame:0
         TX packets:86768 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:150
         RX bytes:30701269 (29.3 Mb)  TX bytes:7878926 (1.9 Mb)
         Interrupt:9 Base address:0x5000

Check the details from the above output one by one as shown below:

1. Make sure the IP address (192.168.9.5) is not duplicated on your network. An IP address can serve only one interface at a time; it cannot be active on two or more devices at once.

If you find a duplicate of this IP on the network, take the other device off that address and then test the interface again. Your network team can help identify why the IP was duplicated and fix the cause.
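
One way to probe for a duplicate from the server itself is the duplicate address detection mode (-D) of arping from the iputils package, which exits with status 0 when no other host answers for the address. The sketch below assumes that arping variant is installed. The function name check_duplicate_ip and the ARPING override variable are hypothetical and exist only to make the sketch testable.

```shell
#!/bin/sh
# Probe for another host answering for IP $2 on interface $1, using the
# duplicate address detection mode (-D) of iputils arping.
# ARPING can be set to substitute another command (assumption, for testing).
check_duplicate_ip() {
    if "${ARPING:-arping}" -D -I "$1" -c 2 "$2" >/dev/null 2>&1; then
        echo "no duplicate detected for $2"
    else
        # A non-zero status can also mean the probe itself failed.
        echo "possible duplicate for $2 (or the probe failed)"
        return 1
    fi
}
```

For the example above this would be called as check_duplicate_ip eth0 192.168.9.5. Treat a "possible duplicate" result as a hint to investigate further with your network team, not as proof.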

2. Reset the problematic interface using the commands below:

#ifdown eth0
#ifup eth0

Check the functionality and proceed to the next step if the issue persists.

3. Open the file /etc/sysconfig/network-scripts/ifcfg-eth0 and verify that the MAC address (also called the hardware address) of the NIC (Network Interface Card) matches the HWaddr value in the ifconfig output above (HWaddr 01:89:A8:G8:4R:54).

If there is a discrepancy, correct the MAC address in the interface configuration (eth0 in this case) so that it matches the NIC.

There are several ways to find the MAC address of a NIC in Linux; pick the one that suits your case.

For an existing connection, ifconfig should be sufficient.

Check the functionality of the interface once the MAC address is set correctly.
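
The MAC comparison can also be scripted. This sketch assumes the configuration file carries a HWADDR= line and that the kernel exposes the live address in /sys/class/net/<interface>/address. The function names are hypothetical, and the comparison ignores upper/lower case.

```shell
#!/bin/sh
# Normalise a MAC address to lower case so comparisons ignore case.
norm_mac() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# Print the HWADDR value from an ifcfg file ($1), without quotes.
cfg_mac() {
    sed -n 's/^HWADDR=//p' "$1" | tr -d '"'
}

# Succeed when the configured MAC matches the one the kernel reports.
# $1 = interface, $2 = ifcfg file, $3 = sysfs address file (optional override).
mac_matches() {
    live=$(cat "${3:-/sys/class/net/$1/address}")
    [ "$(norm_mac "$(cfg_mac "$2")")" = "$(norm_mac "$live")" ]
}
```

For eth0 this would be mac_matches eth0 /etc/sysconfig/network-scripts/ifcfg-eth0, which returns a non-zero status when the two addresses differ.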

4. If the issue still persists, the next thing to check is errors and drops on RX and TX packets.

RX packets/bytes ⇒ data received by the server
TX packets/bytes ⇒ data sent by the server

#ethtool -S eth0

⇒ This is the best command for seeing RX/TX or CRC errors on the interface.

If RX packets show errors or drops, the data is being damaged or lost before it reaches the server.

At this point, consider resetting the switch port and check whether the error and drop counters are still increasing.

If both counters continue to increase, the first thing to suspect is the NIC on your Linux server.

Check the NIC driver and firmware version using #ethtool -i eth0. If the firmware is out of date, update it following the instructions for your make of NIC.

Check the functionality. If the issue still persists, replace the NIC and check again.

The same applies to TX errors and drops.
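
The ethtool -S output can be long, so it helps to filter it down to the counters that matter. The sketch below keeps only counters whose name contains err, drop, or crc and whose value is non-zero. Counter names differ between NIC drivers, so the name pattern is an assumption, and the function name nonzero_errors is hypothetical.

```shell
#!/bin/sh
# Read `ethtool -S` output on stdin and print only counters whose name
# contains err, drop, or crc and whose value is non-zero.
nonzero_errors() {
    awk -F: 'tolower($1) ~ /err|drop|crc/ && $2 + 0 > 0 {
        gsub(/^[ \t]+/, "", $1)      # strip the leading indentation
        print $1 "=" ($2 + 0)
    }'
}
```

Run ethtool -S eth0 | nonzero_errors a couple of times a few minutes apart. Counters whose values go up between runs are the ones still increasing.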

5. If the issue still persists, have your data center staff check the cable between the external switch and your Linux server, and replace it if it is found faulty.

6. If the issue persists, reboot the host and check the functionality.

7. The steps above isolate the issue on the server side. If everything checks out there, the server can be ruled out.

It is time to hand the issue to your network team to get it fixed on the switch side.

=======
HOW TO TEST NETWORK SPEED: Use the link below

http://linuxunixdatabase.blogspot.in/2017/02/how-to-use-iperf-to-test.html#!/2017/02/how-to-use-iperf-to-test.html

========

HAPPY LINUX LEARNING :)

Feel free to ask any questions or start a discussion about this topic.
