
Friday, 3 March 2017

/var/log/messages log file in Linux (Red Hat and Centos)

**********Logs in Linux (Centos&RedHat) - PART1**********

Click here: Logs in Linux (Centos&RedHat) - PART2
Click here: Logs in Linux (Centos&RedHat) - PART3

Red Hat and CentOS Linux servers keep several log files, each covering a different kind of information: kernel activity, the services running on the server, and the applications deployed on it.

This article is dedicated to the log file I like most, /var/log/messages. It holds plenty of information about the system and its issues, which makes it especially useful for enterprise infrastructure maintenance of a Linux server.

After reading the details below, you will know what to look for in /var/log/messages and how to use it to fix various issues in Linux.

For users, /var/log/messages is effectively a read-only file, because the system itself writes the data to it.

 You can use tools such as more, less, head, tail or vi to view the contents of this file.

Example:

#tail -n 5 /var/log/messages

Feb 2 07:35:44 server1 cib: [22388]: info: cib_stats: Processed 7 operations (5714.00us average, 0% utilization) in the last 10min
Feb 2 07:37:29 server1 PowerPath: Management Component: Warning: Cannot retrieve devices from MPAPI.
Feb 2 07:45:44 server1 cib: [22388]: info: cib_stats: Processed 7 operations (2857.00us average, 0% utilization) in the last 10min
Feb 2 07:47:29 server1 PowerPath: Management Component: Warning: Cannot retrieve devices from MPAPI.
Feb 2 07:55:44 server1 cib: [22388]: info: cib_stats: Processed 8 operations (5000.00us average, 0% utilization) in the last 10min


This file mainly contains server startup and shutdown messages, messages about the storage attached to the server, local file system errors, network port link status, server restart times, and cluster-related messages if your server is part of a cluster.

I suggest looking into /var/log/messages to find the reason for issues such as an abrupt system reboot, a hung SAN file system, fsck errors on local file systems, and network connectivity problems.

There will be a lot of entries in the file. The trick is to match the time at which the issue was first noticed with the timestamps in the log file.

Then read the entries shortly before and after that time.
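
As a minimal sketch of this time-matching step, the shell function below prints only the entries of one day that fall inside a time window. The name log_window and its argument order are my own choices for illustration, and it assumes the traditional "Mon DD HH:MM:SS" syslog timestamp shown in the sample output above.

```shell
# log_window FILE MONTH DAY START END
# Prints the lines of FILE stamped on "MONTH DAY" whose time lies between
# START and END (both HH:MM:SS, inclusive).
# Hypothetical helper, not a standard command.
log_window() {
    awk -v mon="$2" -v day="$3" -v start="$4" -v end="$5" '
        $1 == mon && $2 == day && $3 >= start && $3 <= end
    ' "$1"
}

# Example (run as root on a real server):
#   log_window /var/log/messages Feb 2 07:30:00 07:50:00
```

The times are compared as plain strings, which works because HH:MM:SS is zero-padded. The sketch does not handle a window that crosses midnight.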

Trace from /var/log/messages for different issues:

1. PowerPath: Management Component: Warning: Cannot retrieve devices from MPAPI.
                                      OR
    kernel: sd 3:0:0:0: SCSI error: return code = 0x00010000

This message indicates an issue with EMC storage that is attached to the server, or that was recently removed from the server incorrectly.

 Fix:

Check the status of the SAN file systems with the #df -h command and confirm that each file system is healthy and that read/write operations are possible.
Also look for other messages related to this issue in /var/log/messages until you can reach a conclusion about the cause.
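
One possible way to confirm that read/write operations work is to write a small probe file and read it back. The sketch below assumes you pass it the mount point of the file system you want to test. The name fs_rw_check and the .rwcheck file prefix are hypothetical.

```shell
# fs_rw_check DIR
# Returns 0 if a small probe file can be written to DIR and read back.
# Hypothetical helper; DIR would be the mount point of the SAN file system.
fs_rw_check() {
    probe=$(mktemp "$1/.rwcheck.XXXXXX") || return 1   # create the probe file
    echo ok > "$probe" && [ "$(cat "$probe")" = "ok" ] # write, then read back
    status=$?
    rm -f "$probe"                                     # always clean up
    return "$status"
}

# Example:
#   fs_rw_check /path/to/mountpoint && echo "read/write OK"
```

On a file system that is hung rather than read-only, the check itself may block, so it can help to run it under a time limit (for example with the coreutils timeout command, where that is available).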

2. kernel:eth0: link status is down

This message shows that the network interface eth0 is currently down and needs action to fix the issue.

 Fix:

Refer to my other post below to troubleshoot network issues on Linux servers.
http://linuxunixdatabase.blogspot.com/2017/02/linuxunix-network-troubleshooting.html

3. If you see messages saying that a file system is corrupted or has errors

Refer to my other post below to troubleshoot file system errors.

http://linuxunixdatabase.blogspot.com/2017/02/file-system-state-is-clean-with-errors.html

4. Feb 19 04:02:02 server1 syslogd 1.4.1: restart.
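This message records the time at which syslogd was restarted. In my experience, its timestamp is often very close to the time of a server reboot. Note that syslogd can also be restarted for other reasons, such as log rotation, so confirm the reboot time before drawing conclusions.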

For exact time of Linux Server restart, use the below command:

[root@server1 ]# last | grep boot
reboot   system boot  2.6.18-308.el5   Sat Nov 19 09:38         (97+00:06)
reboot   system boot  2.6.18-308.el5   Sat Oct 22 10:49         (27+23:33)


You may come across many other issues in your own work. Please feel free to post them here, and I will try to come back with something useful.


HAPPY LINUX LEARNING :)

Logs in Linux (Centos&RedHat) - PART2


This week's article on my blog covers the must-know Linux (RHEL/CentOS/Fedora) logs other than /var/log/messages, which was covered in Logs in Linux (Centos&RedHat) - PART1.

As you might already have noticed, my blog prefers practical use of knowledge over just putting it down on paper.

Below is the list of logs that an enterprise Linux system administrator, or anyone fond of Linux, can use daily to fix various issues on a Linux server :).

1. /var/log/maillog:


This log records the activity of the mail server application deployed on your Linux server.

Mail software that can be deployed on a Linux server includes:

1. Mutt (command-line email client)
2. Sendmail
3. Qmail
4. Postfix
5. Alpine (command-line email client)
6. Exim
7. Zimbra
 
Entries in /var/log/maillog file are usually like below:
 
Feb 21 04:05:01 Server1 sendmail[1120]: v1Q9916l008142: from=username, size=374, class=0, nrcpts=1, msgid=<2017022119263.v1Q9916l008142@server1.domainname.com>, relay=username@localhost
 
Feb 21 04:10:01 server1 sendmail[1953]: v1a0d1rc019730: from=, size=647, class=0, nrcpts=1, msgid=<201702210905.v1a0d1rc019730@server1.domainname.com>, proto=ESMTP, daemon=MTA, relay=username [127.0.0.1]
 
Feb 21 04:10:01 server1 sendmail[1246]: v1Q9x2vg319764: to=username, ctladdr=username (27341/674), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30374, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (v1Q9x2vg319764 Message accepted for delivery) 
 
The lines above show the time at which the message was sent or received, the server name, the mail server deployed on Linux, the queue ID, the message size, the protocol (usually SMTP or ESMTP) and the relay (mail server) used to send or receive the message.
server1 ⇒ Linux server name
sendmail ⇒ mail server deployed on my Linux server
v1Q9916l008142 ⇒ queue ID that sendmail assigned to the message (the msgid field holds the Message-ID header)
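
To pull these fields out of a busy maillog, something like the following sketch may help. It assumes the sendmail line layout shown in the samples above (queue ID in the sixth field, then comma-separated key=value pairs). The name maillog_deliveries is hypothetical.

```shell
# maillog_deliveries FILE
# For each sendmail delivery line (one with both to= and stat=), prints:
#   queue-id  recipient  first-word-of-status
# Hypothetical helper; assumes the sendmail line layout shown above.
maillog_deliveries() {
    awk '/ to=/ && / stat=/ {
        qid = $6
        sub(/:$/, "", qid)                          # drop the trailing colon
        match($0, / to=[^,]*/)                      # recipient up to next comma
        rcpt = substr($0, RSTART + 4, RLENGTH - 4)
        match($0, / stat=[A-Za-z]*/)                # first word of the status
        stat = substr($0, RSTART + 6, RLENGTH - 6)
        print qid, rcpt, stat
    }' "$1"
}

# Example:
#   maillog_deliveries /var/log/maillog | grep -v ' Sent$'
```

Filtering out the lines that end in Sent, as in the example comment, leaves the deliveries that were deferred or failed, which is usually where troubleshooting starts.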

 When do you look into maillog?

1. If mail is not being sent or received as expected, or is not working at all
2. If sending or receiving mail is delayed
3. If you see a message saying that the mail server is not accepting connections when you try to send an email
4. To see whether any spamming is happening or whether messages are still in the queue
5. As a regular practice, to make sure no other errors or warnings are present in the log, so that your mail server keeps running without disruption

Symptoms to look for in maillog:

  • Rejecting connections
Fix: Make sure the SMTP server configured in /etc/mail/sendmail.cf is reachable and functioning.
  • unable to qualify my own domain name (localhost)
Fix: Add the line below to /etc/hosts
(replace server1 with your server name)
127.0.0.1  server1 localhost.localdomain localhost

2. /var/log/lastlog:

 

Unlike the other log files, which are text files, this log file is a binary data file. So we cannot read it directly with commands such as vi, vim, more, tail, head or cat.
[root@server1 ~]# file /var/log/lastlog
/var/log/lastlog: data
[root@server1 ~]# file /var/log/secure
/var/log/secure: ASCII text
  • Linux provides the #lastlog command to get readable details from the /var/log/lastlog file.
  • The #lastlog command shows the most recent login of all users, or of a given user.
Sample output of the #lastlog command:
 
allen                               **Never logged in**
ntp                                 **Never logged in**
appuser           pts/1     Fri Sep 16 15:35:56 -0400 2016
albert            pts/0    192.168.1.1      Wed Mar 16 22:35:05 -0400 2016
general           pts/0    192.168.2.1     Fri Mar 30 22:02:26 -0400 2012
ftp_user          pts/2    x.x.x.x      Wed Jun 25 14:16:06 -0400 2014 
 
In the above output, the first column is the username, the second is the port (terminal), the third is the source system from which the user connected to the target server, and the last shows the user's most recent login time.
 
We can get the logon details of a particular user as well like below:
 
[root@server1 ~]# lastlog -u abc
Username         Port     From                         Latest
abc              pts/2    anotherserver.domain.com  Wed Feb  5 10:55:13 -0500 2017
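
A common follow-up is to list the accounts that have never logged in. The sketch below simply filters lastlog output for the "**Never logged in**" marker shown in the sample above. The name never_logged_in is hypothetical.

```shell
# never_logged_in
# Reads lastlog output on standard input and prints the usernames whose
# entry says "**Never logged in**". Hypothetical helper name.
never_logged_in() {
    awk '/\*\*Never logged in\*\*/ { print $1 }'
}

# Example:
#   lastlog | never_logged_in
```

Many of the names it prints will be system accounts such as ntp, which are not meant to log in interactively.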
 
 
PURPOSE OF THIS FILE:
 
  1. The output of this file/command can be used to track a user's most recent login, or to see which users have visited the server in the recent past.
  2. This file does NOT contain any errors, so treat it as an informational file; it is rarely used in troubleshooting.
 
Last but not least, the Linux man page for the #lastlog command (man lastlog) describes its options.

3. /var/log/wtmp:


 This file is also a data file, like /var/log/lastlog.

Linux provides the #last command to read this file. The related #lastb command shows failed login attempts, which it reads from the companion file /var/log/btmp.
However, in some cases the file /var/log/wtmp may not be present, depending on the admin's local configuration.
PURPOSE OF THE COMMAND #LAST:
 
  1. The #last command displays a list of all users who logged in to (and out of) the Linux server.
  2. Use the #last command to find out who was logged in at a particular time (specify that time with -t).
  3. There is no need to handle the data file (wtmp) directly, because #last is readily available.
  4. Use it to find the server's recent reboot times.
Sample output of the #last command:
root        pts/2        192.168.1.1         Wed Jan 14 10:05 - 11:28  (00:23)
root        pts/0        192.168.1.2         Wed Jan 14 09:29 - 9:11   (00:42)
root        pts/0        server2.domain.com  Tue Jan 13 01:02 - 09:13  (02:11)
appuser   pts/3        192.168.1.4         Mon Jan 12 14:54 - 11:05  (03:11)
admin pts/0        server1.domain.com   Thu Jan  8 00:04 - 00:07  (00:00)
 
Column1 => Username
Column2 => tty
Column3 => Host (jump server) from which the user logged on to our Linux server
Column4 => Logon time, followed by the logoff time and the session duration
 
IMPORTANT POINT:
 
The pseudo username “reboot” logs in each time the system is rebooted for any reason. So the command “#last reboot” shows every reboot recorded in the current wtmp file, which is very useful for troubleshooting server availability issues.
 
[root@server3 ~]# last reboot
reboot   system boot  2.6.18-308.el5   Tue Jun 2  09:33         (247+19:50)
reboot   system boot  2.6.18-308.el5   Thu Jan 1  23:47         (529+08:41)
reboot   system boot  2.6.18-274.3.1.e Thu Jan 09 23:29          (00:12)
reboot   system boot  2.6.18-194.26.1. Thu Jan 17 23:13          (00:12)
reboot   system boot  2.6.18-194.26.1. Thu Jan 12 22:29          (00:38)
**There is also the #lastb command, which shows only failed logins, the so-called bad logins.
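
When the #last output is long, a per-user summary can be easier to read. The sketch below counts sessions per username from #last output piped into it. The name logins_per_user is hypothetical, and the sketch skips the "reboot" pseudo user and the trailing "wtmp begins ..." line.

```shell
# logins_per_user
# Reads the output of last (or lastb) on standard input and prints
# "count username", busiest user first. Hypothetical helper name.
logins_per_user() {
    awk 'NF > 0 && $1 != "reboot" && $1 != "wtmp" && $1 != "btmp" { n[$1]++ }
         END { for (u in n) print n[u], u }' | sort -k1,1nr -k2,2
}

# Example:
#   last | logins_per_user
```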
------------------------------------THE END OF PART2-----------------------------------

HAPPY LINUX LEARNING :)
 

Logs in Linux (Centos&RedHat) - PART3

 
4. /var/log/secure:
 
This is a text file and can be viewed with commands such as tail, head, vi, cat or more.
It holds information about user authentication on the server and about the use of authorization privileges.
So user logins through the ssh and telnet services are tracked here, including unsuccessful login attempts.
Sample entries from this file for sudo and su activity are shown below:
 
Feb 26 04:10:01 server2 sudo:   appuser : no tty present and no askpass program specified ; TTY=unknown ; PWD=/home/appsur ; USER=root ; COMMAND=/admin/cron
 
Feb 26 04:10:22 server2 sudo:   appuser : TTY=unknown ; PWD=/home/otheruser ; USER=root ; COMMAND=/usr/bin/view
 
Mar  1 04:59:14 server1 su: pam_unix(su-l:session): session opened for user abc by root(uid=0)
Mar  1 04:59:16 server1 su: pam_unix(su-l:session): session closed for user abc
 
Unsuccessful login entry in /var/log/secure file:
 
Mar  1 05:01:50 server1 sshd[15222]: Failed password for abc from 192.168.1.1 port 41043 ssh2
Mar  1 05:01:54 server1 sshd[15172]: Failed password for abc from 192.168.1.1 port 41047 ssh2
Mar  1 05:01:58 server1 sshd[16172]: Failed password for abc from 192.168.1.1 port 41047 ssh2
Mar  1 05:01:58 server1 sshd[15279]: Connection closed by 192.168.1.1
Mar  1 05:01:58 server1 sshd[15175]: PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=server1.domainname.com  user=abc
 
I got the above entries after trying to log in three times with an incorrect password for user “abc”.
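
To spot repeated failures quickly, the entries can be counted per user and source address. The sketch below assumes the sshd "Failed password for ... from ... port ..." line format shown above. The name failed_logins is hypothetical.

```shell
# failed_logins FILE
# Counts sshd "Failed password" entries in FILE per user and source address
# and prints "count user address", highest count first.
# Hypothetical helper; assumes the sshd line format shown above.
failed_logins() {
    awk '/Failed password for/ {
        # The user is the word before "from", the address the word after it.
        for (i = 2; i < NF; i++)
            if ($i == "from") { n[$(i - 1) " " $(i + 1)]++; break }
    }
    END { for (k in n) print n[k], k }' "$1" | sort -k1,1nr
}

# Example (run as root):
#   failed_logins /var/log/secure | head
```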
 PURPOSE OF THIS FILE:
  1. To check whether any unauthorised user is trying to log on to your server.
  2. To see whether a user is trying to use privileges that have not been granted to them.
  3. To protect your server from suspicious login attempts.
 
Command related to the /var/log/secure file:
 
The #faillog command reports login failures for all users or for a particular user, and can also set the lockout limit applied after unsuccessful login attempts.
 
#faillog -a  ⇒ displays the login failures of all users on the Linux server
#faillog -u <username> ⇒ displays the login failures of the named user on the Linux server
 
 
Example:
[root@server1 ~]# faillog -u abc
Login       Failures Maximum Latest                   On
abc             0        0   12/31/16 19:00:00 -0500
 

5. /var/log/dmesg:

  • I feel this is the second most important file in Linux after /var/log/messages, as it has information about all the hardware associated with your server.
  • This file holds a copy of the kernel ring buffer. When the Linux server boots up, we see a number of messages on the screen about the hardware devices that the kernel detected during the boot process.
  • These messages are kept in the kernel ring buffer, where the oldest messages are overwritten as new ones arrive.
The content of the ring buffer can also be read with the #dmesg command.
 
 PURPOSE OF THIS FILE:
We usually refer to this file, or use the #dmesg command, to get information about the memory available on the system, the network cards installed, the USB devices in use, the serial ports (ttys), the number of CPUs and how many of them are hot-pluggable, and so on.
 
Example:
 
[root@server4 ~]# dmesg | grep -i memory
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
Memory: 16407184k/18874368k available (2616k kernel code, 358416k reserved, 1672k data, 224k init)
Freeing initrd memory: 2685k freed
Total HugeTLB memory allocated, 0
Non-volatile memory driver v1.2
Freeing unused kernel memory: 224k freed 
 

6. /var/log/cron:

 
This file records an entry whenever the cron daemon (or anacron) starts running a cron job.
This is a readable text file.
Sample output from /var/log/cron:
Feb 2 01:30:01 server1 CROND[200]: (abcuser) CMD (/home/usr/fssize > /dev/null 2>&1)
Feb 2 02:30:01 server1 CROND[201]: (abcuser) CMD (/home/usr/maintenance.sh > //home/abcuser/log/maintenance.out)
 
 
What if your cron job did not run and you do not know where to look?
 
Check the following:
  1. See whether there is an entry for your job in the /var/log/cron file. If there is, check the log file that receives the output of your script.
  2. It is recommended that the output of a cron job be redirected to a log file, which helps to debug the job in case of any issues. See whether anything was written to that output file.
  3. See whether your job is commented out in the crontab, using the #crontab -l command.
  4. Compare the timestamps in /var/log/cron with the scheduled start time of your job to see whether the job was kicked off at all.
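
Checks 1 and 3 above can be sketched as two small shell functions. The names cron_runs and commented_jobs are hypothetical, and the first assumes the "CMD (...)" entry format shown in the sample output.

```shell
# cron_runs FILE TEXT
# Prints the "CMD (...)" entries in FILE whose line contains TEXT,
# for example the script name of your job. Hypothetical helper.
cron_runs() {
    grep -F 'CMD (' "$1" | grep -F -- "$2"
}

# commented_jobs
# Reads crontab -l output on standard input and prints the lines that are
# commented out. Hypothetical helper.
commented_jobs() {
    grep -E '^[[:space:]]*#'
}

# Examples:
#   cron_runs /var/log/cron maintenance.sh
#   crontab -l | commented_jobs
```

If cron_runs prints nothing for your job, cron never started it, and the schedule or the crontab entry itself is the next thing to check.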
 
Apart from the logs mentioned above, you may find the logs below on your Linux server if the corresponding applications/servers are installed.
 
7. /var/log/yum.log ⇒ This file has details about yum command activity
8. /var/log/mysqld.log ⇒ This file has details about the MySQL server, if you are using it
9. /var/log/httpd ⇒ Apache access and error logs directory
 

HAPPY LINUX LEARNING :)