[Tutorial] Checking, reading, and understanding "Server Load" numbers

Started by Xhanch Studio, January 13, 2012, 05:07:52 PM


Xhanch Studio

Server load is a number in x.xx format, starting from 0.00. It expresses how many processes are waiting in the queue to access the processor(s). It is calculated over a certain period of time, and the smaller the number, the better. A high number is often associated with a decrease in the performance of the server.

So this definition says that the CPU load tells you how many processes your server's processor is about to execute. Or does it? In fact, that is a very wrong definition of CPU load. Here is what the manual of the uptime Unix command says:
Quote: uptime gives a one line display of the following information. The current time, how long the system has been running, how many users are currently logged on, and the system load averages for the past 1, 5, and 15 minutes.


That means the averages you see do not show the processes waiting right now, but the processes that waited over the past 1, 5, and 15 minutes. (Solaris includes some runtime processes, but it cannot predict which processes will wait in the next one-minute queue.) So, if the CPU load is 4 and you have 4 CPUs, does that mean 4 processes were waiting for CPU access in the last 60 seconds? Does that seem like a lot for current servers with RAID-controlled hard drives and 4 or 8 GB of RAM? Also note that the Linux kernel treats threads as processes. Threads can improve performance a lot, which is why most people use thread-based models these days. So, on Linux, the 4 waiting processes could just as well be 4 threads, and in some cases threads can be served faster than processes.
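To put the "load of 4 on 4 CPUs" example into numbers, here is a minimal Python sketch. The helper name per_cpu_load is made up for illustration, and dividing the load by the CPU count is a common rule of thumb rather than something the uptime manual states. os.getloadavg() and os.cpu_count() are standard library calls; os.getloadavg() is only available on Unix-like systems.

```python
import os


def per_cpu_load(load_averages, cpu_count):
    """Divide each load average by the number of CPUs (hypothetical helper)."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(round(value / cpu_count, 2) for value in load_averages)


if __name__ == "__main__" and hasattr(os, "getloadavg"):
    # os.getloadavg() returns the 1, 5 and 15 minute load averages.
    averages = os.getloadavg()
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    print("load averages:", averages)
    print("per CPU:", per_cpu_load(averages, cpus))
```

With the numbers from the example above, per_cpu_load((4.0, 4.0, 4.0), 4) gives 1.0 for each interval, i.e. roughly one process per CPU.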


How to check the server's load?

There are a few different ways to keep an eye on your server's load. The first thing you need to do is log in to your server via SSH.

  • [spoiler=Using the uptime command]The uptime shell command produces the following output:

    Quote:
    [pax:~]% uptime
    9:40am up 9 days, 10:36, 4 users, load average: 0.02, 0.01, 0.00


    It shows the current time, how long the system has been running since it was last booted, the number of users currently logged on, and something called the load average.[/spoiler]

  • [spoiler=Using the procinfo command]On Linux systems, the procinfo command produces the following output:

    Quote:
    [pax:~]% procinfo
    Linux 2.0.36 (root@pax) (gcc 2.7.2.3) #1 Wed Jul 25 21:40:16 EST 2001

    Memory:     Total     Used     Free   Shared  Buffers   Cached
    Mem:        95564    90252     5312    31412    33104    26412
    Swap:       68508        0    68508

    Bootup: Sun Jul 21 15:21:15 2002 Load average: 0.15 0.03 0.01 2/58 8557


    The load average appears on the Bootup line of this output, as the three numbers after "Load average:".[/spoiler]

  • [spoiler=Using the w command]The w command produces the following output:

    Quote:
    [pax:~]% w
    9:40am up 9 days, 10:35, 4 users, load average: 0.02, 0.01, 0.00
    USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU   WHAT
    mir      ttyp0    :0.0              Fri10pm  3days  0.09s  0.09s  bash
    neil     ttyp2    12-35-86-1.ea.co  9:40am   0.00s  0.29s  0.15s  w


    Notice that the first line of the output is identical to the output of the uptime command.[/spoiler]

  • [spoiler=Using the top command]The top command is a more recent addition to the UNIX command set that ranks processes according to the amount of CPU time they consume. It produces the following output:

    Quote:
    4:09am up 12:48, 1 user, load average: 0.02, 0.27, 0.17
    58 processes: 57 sleeping, 1 running, 0 zombie, 0 stopped
    CPU states: 0.5% user, 0.9% system, 0.0% nice, 98.5% idle
    Mem: 95564K av, 78704K used, 16860K free, 32836K shrd, 40132K buff
    Swap: 68508K av, 0K used, 68508K free 14508K cached

      PID USER     PRI  NI SIZE  RSS SHARE STAT LIB %CPU %MEM  TIME COMMAND
     5909 neil      13   0  720  720   552 R      0  1.5  0.7  0:01 top
        1 root       0   0  396  396   328 S      0  0.0  0.4  0:02 init
        2 root       0   0    0    0     0 SW     0  0.0  0.0  0:00 kflushd
        3 root     -12 -12    0    0     0 SW<    0  0.0  0.0  0:00 kswapd
    ...


    We like to use the top command because it also shows server uptime, memory information and the list of processes that you can sort by CPU usage, etc.[/spoiler]
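If you want to pull the three load averages out of this kind of output in a script, here is a rough Python sketch. The function name parse_load_averages is hypothetical, and the regular expression only assumes the "load average:" layout seen in the uptime, w, top and procinfo samples above (numbers separated by commas or plain spaces, with a dot as the decimal separator). Output on other systems or locales may look different.

```python
import re

# Matches "load average: 0.02, 0.01, 0.00" as well as
# "Load average: 0.15 0.03 0.01" (procinfo style, no commas).
LOAD_PATTERN = re.compile(
    r"load average:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)",
    re.IGNORECASE,
)


def parse_load_averages(line):
    """Return the (1 min, 5 min, 15 min) load averages found in a line."""
    match = LOAD_PATTERN.search(line)
    if match is None:
        raise ValueError("no load average found in: %r" % line)
    return tuple(float(value) for value in match.groups())


if __name__ == "__main__":
    sample = "9:40am up 9 days, 10:36, 4 users, load average: 0.02, 0.01, 0.00"
    print(parse_load_averages(sample))
```

The same function can be fed the first line of the w or top output, since that line has the same layout as the uptime output.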



What is a good load, bad load and in between?

I know you're asking: what is a good system load, and what is a bad one? Anything around 1.0 and below is fine; try to keep your regular load averages under 1.0. If you notice your server slowing down, check the load first. We recently hosted a site that was mentioned in the media (TV, news, radio), and the server load skyrocketed because of the huge wave of traffic. The load went from 0.25 to 37.00 just because the server was getting hammered.

When your regular average starts to creep up to around 2.0, your server is very busy and you should consider getting another machine or upgrading your hardware. By regular average, I mean the load when the system is idle during the day and isn't processing all your logs or backing up data.
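The rule of thumb above can be written down as a small Python sketch. The function name classify_load and the exact cut-off points are my own assumptions for illustration: the text only says that around 1.0 and below is fine and that around 2.0 is very busy, so where exactly the boundaries fall is a judgment call.

```python
def classify_load(load_average):
    """Label a regular load average using the rough thresholds from the text."""
    if load_average < 0:
        raise ValueError("load average cannot be negative")
    if load_average <= 1.0:
        return "fine"        # around 1.0 and below
    if load_average < 2.0:
        return "in between"  # worth keeping an eye on
    return "very busy"       # around 2.0 and above


if __name__ == "__main__":
    for value in (0.25, 1.5, 37.0):
        print(value, "->", classify_load(value))
```

Feed it your regular average, not a reading taken while backups or log processing are running.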

Having an overloaded server can lead to many problems and should always be avoided. I hope this guide was helpful and gave you some more insight into server loads, what to use to monitor them, and what counts as a good or bad load average.
Best Regards,
Susanto B.Sc
----------------------------------------------------------------------------
Web development services, WordPress plugin and theme development, PSD to XHTML conversion - http://xhanch.com
Read free manga online - http://authrone.com
