Thursday 17 December 2009

Basic and fundamental knowledge

I was recently asked some basic, fundamental questions directly related to the Linux system administration job. Once you are a senior engineer, you tend to forget the fundamentals, or stop caring about them. But the truth is that you still need to know them, precisely because they are so fundamental. People will use these basics to judge whether you are really knowledgeable and whether your level is what you say it is. I was not able to answer all the questions satisfactorily, and I feel somewhat ashamed about it.

Let's learn it together.


1. What is the difference between a network hub and a network switch? 

When I searched Google for the answer, I found that this site puts it very nicely. Take a look at http://www.duxcw.com/faq/network/hubsw.htm . In essence:
  • A hub repeats the packet it receives on one port to all the other ports
  • The bandwidth is shared across all the ports. If the hub is 10Mbps with 5 ports and all 5 ports are transmitting, each port gets roughly 2Mbps on average
  • A switch divides the network into multiple segments, so a pair of ports can communicate without affecting other pairs of ports
  • A switch maintains a table of destination (MAC) addresses and their ports, so when a packet arrives, it sends the packet only to the correct port
  • The bandwidth of each port is dedicated. If the switch is 10Mbps with 5 ports, when port 1 talks to port 2, the bandwidth is 10Mbps for that pair, and when port 3 talks to port 4 at the same time, the bandwidth is also 10Mbps for that pair

2. How many bits are there in a MAC address?

I could not answer this question correctly. The answer is 48 bits. This site provides the information: http://compnetworking.about.com/od/networkprotocolsip/l/aa062202a.htm . In essence:
  • A MAC address is a 12-digit hexadecimal number
  • 1 hex digit = 4 bits, thus 12 hex digits = 48 bits
  • The hex symbols are 0123456789ABCDEF
  • The first 6 hex digits identify the manufacturer
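
To make the arithmetic concrete, here is a small Python sketch that counts the bits and splits off the manufacturer part. The function name parse_mac and the sample address are made up for illustration.

```python
def parse_mac(mac: str) -> dict:
    """Split a MAC address into its bit length and manufacturer prefix."""
    # Drop the usual separators so that only the hex digits remain.
    digits = mac.replace(":", "").replace("-", "").upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError(f"not a 12-digit hexadecimal MAC address: {mac!r}")
    return {
        "bits": len(digits) * 4,       # 1 hex digit = 4 bits
        "manufacturer": digits[:6],    # first 6 hex digits
        "device": digits[6:],          # remaining 6 hex digits
        "value": int(digits, 16),      # the whole address as one integer
    }


info = parse_mac("00:1A:2B:3C:4D:5E")  # made-up sample address
print(info["bits"], info["manufacturer"])  # prints: 48 001A2B
```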

3. What is the difference between TCP and UDP?

Wikipedia has the answer: http://en.wikipedia.org/wiki/User_Datagram_Protocol#Comparison_of_UDP_and_TCP
  • TCP is Transmission Control Protocol
  • TCP is connection oriented
  • When a machine sends a TCP packet, the receiving machine has to send back an acknowledgment packet when it arrives
  • If the sending machine fails to get the acknowledgment within a certain time period, the packet is sent again
  • An example of TCP usage is between a web server and a web browser
  • UDP is User Datagram Protocol
  • UDP never guarantees that a packet will arrive at its destination
  • When a machine sends a UDP packet, it does not expect an acknowledgment from the receiving machine
  • Examples of UDP usage are audio streaming and DNS
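
The difference also shows up in how a program uses the two protocols. Below is a rough Python sketch over the loopback interface: the UDP sender just fires a datagram at an address, while the TCP client has to connect first. The function names are my own, and on a real network the UDP datagram could simply get lost.

```python
import socket


def udp_roundtrip(message: bytes) -> bytes:
    """Send one datagram over loopback; no connection is set up first."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee
        sender.sendto(message, receiver.getsockname())
        data, _addr = receiver.recvfrom(1024)
        return data


def tcp_roundtrip(message: bytes) -> bytes:
    """Send bytes over loopback; a connection must exist before any data flows."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        # connect() performs the handshake; the kernel handles acknowledgments
        # and retransmission for everything sent afterwards.
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            conn, _addr = listener.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(message)
                received = b""
                # TCP is a byte stream, so keep reading until all bytes arrived.
                while len(received) < len(message):
                    chunk = conn.recv(1024)
                    if not chunk:
                        break
                    received += chunk
                return received


print(udp_roundtrip(b"hello over udp"))
print(tcp_roundtrip(b"hello over tcp"))
```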

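3. What information is available in a TCP packet? Name some of it.
  • Source port
  • Destination port
  • Checksum
  (The source and destination addresses are in the IP header that carries the TCP packet, not in the TCP header itself.)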

4. What flags are available in a TCP packet? Name some of them.

I also could not answer this question correctly. The answers are:
  • SYN
  • ACK
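
To see where these fields and flags actually live, here is a minimal Python sketch that unpacks the fixed 20-byte TCP header. The TCP header itself carries source and destination ports (the addresses sit in the IP header), and SYN and ACK are single bits in one flag byte next to FIN, RST, PSH and URG. The sample header is hand-built for illustration, not captured from a real connection, and parse_tcp_header is a name I made up.

```python
import struct

# Bit positions of the classic flags inside the TCP flag byte.
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(header: bytes) -> dict:
    """Unpack the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack, offset_byte,
     flag_byte, window, checksum, urgent) = struct.unpack("!HHIIBBHHH", header[:20])
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "sequence": seq,
        "acknowledgment": ack,
        "header_length": (offset_byte >> 4) * 4,  # data offset counts 32-bit words
        "flags": [name for name, bit in TCP_FLAGS.items() if flag_byte & bit],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# Hand-built sample: a SYN+ACK reply from port 80 to port 54321.
sample = struct.pack("!HHIIBBHHH", 80, 54321, 1000, 2001, 5 << 4, 0x12, 65535, 0, 0)
parsed = parse_tcp_header(sample)
print(parsed["source_port"], parsed["destination_port"], parsed["flags"])
```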


5. When you execute the "uptime" command, there are 3 numbers at the end of the line. What are they?

The answer is here: http://linux.die.net/man/1/uptime . The 3 numbers are the load averages for the past 1, 5, and 15 minutes.


6. What is the meaning of the load average number?

Here is the answer: http://www.lifeaftercoffee.com/2006/03/13/unix-load-averages-explained/ . It means "the average sum of the number of processes waiting in the run-queue plus the number currently executing over 1, 5, and 15 minute time periods."


7. How do you know that the server is busy from the number?

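The best answer is probably from this site: http://www.teamquest.com/resources/gunther/display/5/index.htm . For this question, I answered that if the number is 2 or bigger, the server is busy or under heavy load. This is relative: the number has to be read against the number of CPUs, because a load of 2 means processes are waiting on a single-CPU machine, while on a 4-CPU machine it is only half of the capacity. You will get a feel for it from experience handling Linux or Unix machines.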
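
One way to make the number less relative is to divide it by the number of CPUs. The Python sketch below does that; the threshold of 1.0 per CPU is only a rule of thumb I picked for illustration, not an official limit, and the helper names are my own.

```python
import os


def load_per_cpu(load_averages, cpu_count):
    """Divide each load average by the number of CPUs."""
    return tuple(load / cpu_count for load in load_averages)


def looks_busy(load_averages, cpu_count, threshold=1.0):
    """Rule of thumb: busy when the 1-minute load per CPU exceeds the threshold."""
    return load_per_cpu(load_averages, cpu_count)[0] > threshold


try:
    current = os.getloadavg()      # same three numbers that "uptime" prints
    cpus = os.cpu_count() or 1     # cpu_count() can return None
    print("load averages:", current)
    print("per CPU:", load_per_cpu(current, cpus))
    print("busy?", looks_busy(current, cpus))
except (AttributeError, OSError):
    # getloadavg() is missing or unavailable on some platforms.
    print("load average is not available on this system")
```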


8. What happens when the machine is busy?

I answered that the most noticeable clue is that you have trouble accessing the server remotely. When you SSH to the server, it takes a while before you are able to log in. This is because the SSH connection is encrypted, and the server needs to decrypt the data before it can give you access. Encryption and decryption take a lot of CPU cycles, so if the machine is already busy, logging in takes noticeably longer.

Other than that, if the machine has a small amount of memory, you will see a lot of disk activity, because the OS is swapping applications that reside in memory but are not currently executing out to disk, to make way for applications that have higher priority.


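9. How do you list the processes running on the machine?

Use the command "ps". Run on its own, it only lists the processes of the current terminal; "ps aux" lists every process on the machine.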



10. How do you terminate a misbehaving application?

Use the command "kill -9 <pid>", where <pid> is the process ID shown by "ps". The number 9 sends the SIGKILL signal. To terminate all applications with the same name, use "pkill <appname>".


11. What other signals are available?

This site summarizes them: http://linux.about.com/od/commands/l/blcmdl7_signal.htm . Other signals available are SIGHUP and SIGTERM. I answered that SIGHUP restarts an application, but most information on the Internet says that SIGHUP is used to make an application re-read its configuration file, or to stop it. Both are right in a sense: the default action of SIGHUP is to terminate the process, while many daemons catch it and reload their configuration instead, so the effect depends on the application.

SIGTERM is the terminate signal, sent to an application to stop it gracefully. When an application receives SIGTERM, it does whatever is necessary to make sure it stops cleanly.


12. What is the difference between SIGKILL and SIGTERM?

SIGTERM is a graceful termination signal. The application that receives the signal will try its best to stop or notify any dependencies, and then terminate itself. For example, if a parent process gets SIGTERM and it has a few child processes, the parent can notify the children that it is being terminated, and it might also send SIGTERM to the children to terminate them before terminating itself. SIGTERM can be ignored if the application was programmed to do so.

For SIGKILL, the application is terminated directly by the OS; it cannot catch or ignore the signal. No information is sent to any dependencies of the program. If the program has child processes, they become orphans, because their parent has been killed without telling them what to do next. This kind of issue might cause further instability if the server already has some issues.
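
The difference is easy to observe with a child that deliberately ignores SIGTERM. The Python sketch below uses a made-up child program: SIGTERM has no effect on it, while SIGKILL ends it immediately. A negative return code from subprocess means the child was ended by that signal number.

```python
import signal
import subprocess
import sys

# Source of a made-up child program that ignores SIGTERM.
STUBBORN_CHILD = """
import signal, time
signal.signal(signal.SIGTERM, signal.SIG_IGN)
print('ready', flush=True)
time.sleep(60)
"""


def term_then_kill():
    """Return (survived SIGTERM?, return code after SIGKILL)."""
    child = subprocess.Popen([sys.executable, "-c", STUBBORN_CHILD],
                             stdout=subprocess.PIPE, text=True)
    child.stdout.readline()            # wait until SIGTERM is being ignored
    child.send_signal(signal.SIGTERM)
    try:
        child.wait(timeout=1)
        survived_sigterm = False
    except subprocess.TimeoutExpired:
        survived_sigterm = True        # still running one second later
    child.kill()                       # sends SIGKILL, which cannot be ignored
    return_code = child.wait(timeout=5)
    child.stdout.close()
    return survived_sigterm, return_code


print(term_then_kill())
```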


13. Have you experienced an application that will not terminate even after you send it the SIGKILL signal? How do you terminate such an application?

More info here: http://en.wikipedia.org/wiki/Zombie_process . Such an application is called a zombie. To find zombies, use the command "ps aux | grep Z"; a zombie has Z in the STAT column. You cannot kill a zombie because it is already dead; only its entry in the process table remains until its parent collects its exit status. What I usually do, if the zombie has a parent process, is terminate the parent. Most of the time the zombie then goes away as well.

When you terminate the parent, the zombie is re-parented to the process with PID 1. PID 1 is the init process, the first process to run when the server starts, and it normally reaps adopted zombies automatically. There are cases, though, where the entry stays in the process table even with init as its parent. If this happens, I know of no other way to clear it than rebooting the server.
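
To see a zombie come and go, here is a small Python sketch. It assumes a Linux machine with /proc mounted, and the helper names are my own. The child exits right away, shows up with state Z while the parent has not yet collected its exit status, and disappears once the parent calls wait().

```python
import os
import subprocess
import sys
import time


def process_state(pid: int) -> str:
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as handle:
        # Format is "pid (name) state ..."; the name may contain spaces,
        # so split after the last closing parenthesis.
        return handle.read().rsplit(")", 1)[1].split()[0]


def zombie_demo():
    """Return (state seen before reaping, whether the entry is gone afterwards)."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])  # exits at once
    deadline = time.time() + 5
    state = process_state(child.pid)
    # Poll without calling wait(), so the exit status stays uncollected.
    while state != "Z" and time.time() < deadline:
        time.sleep(0.05)
        state = process_state(child.pid)
    child.wait()                       # reap: the parent collects the exit status
    gone = not os.path.exists(f"/proc/{child.pid}")
    return state, gone


print(zombie_demo())
```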


14. What does the pkill command actually do?

pkill looks up the PIDs of the processes whose names match the one specified, then sends a signal to each of them, one by one. The default signal sent is SIGTERM.
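
The idea can be sketched in a few lines of Python. This is only an approximation, assuming Linux with /proc mounted: the real pkill matches the name as a pattern and has many more options, while this sketch uses an exact name match. The function names are made up.

```python
import os
import signal


def pids_by_name(name: str) -> list:
    """List the PIDs whose process name equals `name` (Linux /proc only)."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue                   # not a process directory
        try:
            with open(f"/proc/{entry}/comm") as handle:
                if handle.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue                   # the process exited while we were looking
    return sorted(pids)


def pkill_sketch(name: str, sig: int = signal.SIGTERM) -> list:
    """Send `sig` to every matching process except this one; return the PIDs signalled."""
    signalled = []
    for pid in pids_by_name(name):
        if pid == os.getpid():
            continue                   # never signal ourselves
        try:
            os.kill(pid, sig)
            signalled.append(pid)
        except (ProcessLookupError, PermissionError):
            pass                       # already gone, or owned by another user
    return signalled


# Dry look-up only: show which PIDs share this process's own name.
with open("/proc/self/comm") as handle:
    own_name = handle.read().strip()
print(own_name, pids_by_name(own_name))
```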


That's all. I have learned quite a lot after this event. Hopefully this post will be helpful to someone out there.
Got a comment? Let me know. :)