In this article, we cover almost every aspect of the TCP packet capture and analysis tool called TCPdump. This is our TCPdump Guide.
- What is TCPdump
- Where do I get TCPdump
- TCPdump behavior
- TCPdump for Windows
What is TCPdump?
TCPdump is a UNIX tool used to gather data from the network, decipher the bits, and display the output in a semi-coherent fashion. The semi-coherent output becomes fully coherent with a little explanation and exposure to the tool.
Where Do I Get TCPdump and Its Variants?
You can download TCPdump from ftp://ftp.ee.lbl.gov/tcpdump.tar.Z
You also need to download libpcap, which implements a portable framework for capturing low-level network traffic. You can find it at ftp://ftp.ee.lbl.gov/libpcap.tar.Z
This is the “official” version of TCPdump, authored at Lawrence Berkeley Labs. More recently, a collective effort has arisen to maintain and improve the code, and more feature-rich versions are being developed at www.tcpdump.org.
WinDump is a Windows variant of TCPdump. You can download it from http://netgroupserv.polito.it/windump. It also requires the WinPcap software to function, which you can obtain from the same site.
TCPdump Behavior
After TCPdump has been installed, most operating systems require root access to run it. This is because reading packets requires access to devices that only root can open. TCPdump is run by issuing the command tcpdump. By default, this reads all the traffic from the default network interface and spews all the output to the console. This is not always the behavior the user wants; in fact, it is pretty irritating because records are likely to fly by uncontrollably on a busy network. Therefore, many different command-line options are available to alter the default behavior.
Suppose, for instance, that you don’t want to collect all the traffic from the default network interface. Maybe you are interested only in TCP records. TCPdump comes complete with a filter “language” that denotes the field(s) in an IP datagram to examine; a record is retained only if the specified conditions are met. To collect only TCP records, issue the command tcpdump 'tcp'. The filter in this example is 'tcp'.
Filters get much more complicated and restrictive than this simple one when you use combinations of fields and traits. Just about any field in an IP datagram, including the actual data payload, can be used to limit the purview of collected records. It seems logical that TCPdump should include a way to indicate that the filter is stored in a file so that users don’t have to type a long filter complete with ham-handed keystrokes on the command line itself.
And true to logic, TCPdump has a -F filename option to indicate that the filter is located in the file filename.
As mentioned earlier, TCPdump dumps all the collected output to the screen. This is tolerable behavior if you are looking for a specific record. Most times, however, TCPdump is running in unattended mode, gathering records for retrospective analysis. To gather data for retrospective analysis, you want TCPdump to collect the records in a binary format, also known as raw output. When TCPdump displays records on the console, they have been translated from the native raw output format to a human-readable format.
For retrospective analysis, the desired format for storage is the binary mode, in which all captured data is stored, not just the data translated for output. To collect in raw output mode, use the command tcpdump -w filename, in which filename is the name of the file to which the records will be written in binary format.
To read this raw output file, another command-line option is necessary: tcpdump -r filename. This option reads input to TCPdump from filename rather than from the default network interface.
A file written with the -w option is in binary format, so you read it back by using TCPdump with the -r option rather than by viewing it as text. If you have ever used the UNIX tar utility, you know that when you create a tar file, often referred to as a tarball, you must read that same tar file using tar.
The same principle applies to TCPdump.
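The options covered so far can be combined on one command line. The following Python sketch only assembles the argument lists; it does not run TCPdump. The helper names capture_args and replay_args and the file names are hypothetical, chosen for illustration; only the -w, -r, and -F options come from the text above.

```python
from typing import List, Optional


def capture_args(outfile: str, filter_file: Optional[str] = None) -> List[str]:
    """Build a tcpdump command that writes raw records to outfile (-w)."""
    args = ["tcpdump", "-w", outfile]
    if filter_file is not None:
        # -F tells tcpdump to read the filter expression from a file.
        args += ["-F", filter_file]
    return args


def replay_args(infile: str, filter_expr: Optional[str] = None) -> List[str]:
    """Build a tcpdump command that reads records back from infile (-r)."""
    args = ["tcpdump", "-r", infile]
    if filter_expr is not None:
        # The filter expression is passed as one trailing argument.
        args.append(filter_expr)
    return args


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(capture_args("capture.bin", "filter.txt")))
    print(" ".join(replay_args("capture.bin", "tcp")))
```

Building the command as a list rather than a single string avoids shell quoting problems with longer filters. Actually running the capture command would typically still require root access.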
Altering the Amount of Data Collected
One final option is discussed before proceeding because it determines the amount of data that TCPdump collects. By default, TCPdump does not attempt to collect the entire datagram sent. Collecting everything raises volume concerns, and the user’s interest is often in the header portions of the datagram, which usually fit within the default length. The snapshot length, also known as the snaplen, determines the exact number of bytes collected from each datagram. One of the most common lengths of collected data is 68 bytes.
What exactly do you get with these 68 bytes of data?
The image below shows a sample breakdown of a packet. The header fields can be different lengths than depicted, based on the protocol and header options. First you have an encapsulating link layer header—if this were Ethernet, it would represent 14 bytes of Ethernet frame header with fields such as source and destination MAC addresses.
Next, you have an IP datagram header, which is minimally 20 bytes if there are no IP options. The encapsulated protocol header (TCP, UDP, ICMP, and so on) follows that and can range from 8 bytes to more than 20 bytes for TCP headers with options. The data, or payload in the datagram, is collected after all the headers.
As you can see, there might not be much, if any, payload collected because of the default snaplen. To alter the default snaplen, use the tcpdump -s length command, in which length is the desired number of bytes to be collected. If you want to capture an entire Ethernet frame (not including 4 bytes of trailer), use tcpdump -s 1514.
This captures the 14-byte Ethernet frame header and the maximum transmission unit length for Ethernet of 1500 bytes.
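The arithmetic behind these numbers can be sketched in a few lines of Python. The function names are hypothetical, and the header sizes are the minimum values quoted above (14-byte Ethernet header, 20-byte IP header, 20-byte TCP header or 8-byte UDP header); real packets with options carry longer headers and leave less room for payload.

```python
ETHERNET_HEADER = 14   # bytes, Ethernet frame header
ETHERNET_MTU = 1500    # bytes, maximum transmission unit for Ethernet


def payload_captured(snaplen: int, link: int = ETHERNET_HEADER,
                     ip_header: int = 20, transport_header: int = 20) -> int:
    """Return how many payload bytes fit in snaplen after the headers."""
    return max(0, snaplen - (link + ip_header + transport_header))


def full_frame_snaplen(link: int = ETHERNET_HEADER,
                       mtu: int = ETHERNET_MTU) -> int:
    """Return the snaplen needed for a whole frame, excluding the trailer."""
    return link + mtu


if __name__ == "__main__":
    print(payload_captured(68))                      # TCP, no options: 14
    print(payload_captured(68, transport_header=8))  # UDP: 26
    print(full_frame_snaplen())                      # 1514
```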
You can use many more command-line options with TCPdump. To learn about them, issue the command man tcpdump. Be warned, however, that the output is copious (change the printer cartridge and restock the paper), but very informative if you have the patience and curiosity to wade through it.
One of the hardest tasks for the novice analyst to master is deciphering TCPdump output. TCPdump output is fairly standard for the different protocols (TCP, UDP, ICMP, for example), but does have some nuances. The first step is to identify the protocol that you are examining.
TCP output will be used to explain the general TCPdump format.
Here is a TCP record displayed by TCPdump:
09:32:43.910000 nmap.edu.1173 > dns.net.21: S 62697789:62697789(0) win 512
- 09:32:43.910000 – This is the time stamp in the format of two digits for hours, two digits for minutes, two digits for seconds, and six digits for fractional parts of a second.
- nmap.edu – This is the source host name. If the IP number does not resolve, or if host name resolution is turned off (tcpdump -n option), the IP number appears instead of the host name.
- 1173 – This is the source port number, or port service.
- > – This is the marker to indicate a directional flow going from source to destination.
- dns.net – This is the destination host name.
- 21 – This is the destination port number (for example, 21 might be translated as FTP).
- S – This is the TCP flag. The S represents the SYN flag, which indicates a request to start a TCP connection.
- 62697789:62697789(0) – This is the beginning TCP sequence number:ending TCP sequence number (data bytes). Sequence numbers are used by TCP to order the data received. For a session establishment such as this, the beginning sequence number represents the initial sequence number (ISN), selected as a unique number to mark the first byte of data. The ending sequence number is the beginning sequence number plus the number of data bytes sent within this TCP segment. As you see, the number of data bytes sent for a session establishment request is usually 0. That is why the beginning and ending sequence numbers are the same. Normal session establishments do not send data.
- win 512 – This is the receiving buffer size (in bytes) of nmap.edu for this connection.
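The field breakdown above can be expressed as a small parser. This is a minimal Python sketch, assuming the classic single-line format of the sample record; the regular expression and the function name parse_tcp_record are illustrative and not part of TCPdump. Records that carry no sequence number range (a bare acknowledgment, for example) will not match and return None.

```python
import re
from typing import Optional

# One named group per field described in the list above.
RECORD = re.compile(
    r"^(?P<time>\S+)\s+"
    r"(?P<src>\S+)\.(?P<sport>[^.\s]+)\s+>\s+"
    r"(?P<dst>\S+)\.(?P<dport>[^.\s:]+):\s+"
    r"(?P<flags>\S+)\s+"
    r"(?P<seq_begin>\d+):(?P<seq_end>\d+)\((?P<length>\d+)\)"
    r".*?\bwin\s+(?P<win>\d+)"
)


def parse_tcp_record(line: str) -> Optional[dict]:
    """Split one TCP record into its fields, or return None if it does not match."""
    match = RECORD.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Numeric fields are converted so they can be compared and subtracted.
    for key in ("seq_begin", "seq_end", "length", "win"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    sample = ("09:32:43.910000 nmap.edu.1173 > dns.net.21: "
              "S 62697789:62697789(0) win 512")
    print(parse_tcp_record(sample))
```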
Normal TCP connections have one or more flags set. Flags are used to indicate the function of the connection. The next table shows the TCP flags, their representation in TCPdump, and their meanings.
|TCP Flag|Flag Representation|Flag Meaning|
|---|---|---|
|SYN|S|This is a session establishment request, which is the first part of any TCP connection.|
|ACK|ack|This flag is used generally to acknowledge the receipt of data from the sender. This might be seen in conjunction with or “piggybacked” with other flags.|
|FIN|F|This flag indicates the sender’s intention to gracefully terminate the sending host’s connection to the receiving host.|
|RESET|R|This flag indicates the sender’s intention to immediately abort the existing connection with the receiving host.|
|PUSH|P|This flag immediately “pushes” data from the sending host to the receiving host’s application software. There is no waiting for the buffer to fill up. In this case, responsiveness, not bandwidth efficiency, is the focus. For many interactive applications such as telnet, the primary concern is the quickest response time, which the PUSH flag attempts to signal.|
|URGENT|urg|This flag indicates that there is “urgent” data that should take precedence over other data. An example of this is pressing Ctrl+C to abort an FTP download.|
|Placeholder|.|If the connection does not have a SYN, FIN, RESET, or PUSH flag set, a placeholder (a period) will be found after the destination port.|
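The table translates directly into a lookup. The Python sketch below is illustrative only: the function name describe_flags is hypothetical, and it assumes the flag field and the remainder of the record have already been separated. It follows the table in treating S, F, R, and P as letters in the flag field, and ack and urg as separate words later in the record.

```python
from typing import List

# Single-letter flags that appear in the flag field after the destination port.
FLAG_NAMES = {"S": "SYN", "F": "FIN", "R": "RESET", "P": "PUSH"}


def describe_flags(flag_field: str, rest_of_record: str = "") -> List[str]:
    """Return the names of the TCP flags shown in one TCPdump record."""
    names: List[str] = []
    if flag_field != ".":
        # A period is only a placeholder, so it contributes no flag.
        names = [FLAG_NAMES[ch] for ch in flag_field if ch in FLAG_NAMES]
    words = rest_of_record.split()
    if "ack" in words:
        names.append("ACK")
    if "urg" in words:
        names.append("URGENT")
    return names


if __name__ == "__main__":
    print(describe_flags("S"))                        # ['SYN']
    print(describe_flags(".", "ack 1 win 8760 (DF)"))  # ['ACK']
```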
TCPdump output for TCP is unique; the flag field and the sequence numbers are distinguishing characteristics. When you see these telltale signs in the TCPdump output, you know the record is TCP.
UDP records are likely to have the word udp in the TCPdump output. Although true most of the time, just when you think you can rely on this as a steadfast way to identify UDP output, TCPdump throws you a curve ball. TCPdump analyzes some UDP services, such as Domain Name Service (DNS) and Simple Network Management Protocol (SNMP), at the application level in addition to the protocol level as UDP.
Like Ethereal (now known as Wireshark), TCPdump is protocol-aware and can interpret normally coded payloads of certain protocols. The output might look foreign to you the first few times you see it because it does not have the word udp and because there are no TCP trademarks such as flags or sequence numbers. Typically, this is UDP output with more detail. Finally, ICMP is easily identified because the word icmp appears, without exception, in the TCPdump output.
Absolute and Relative Sequence Numbers
Not to belabor the discussion of TCPdump output any more than is necessary, but TCP sequence numbers need to be addressed in a little more detail.
Sequence numbers are associated only with TCP output, as just discussed. TCP sequence numbers are used by the destination host to reassemble TCP traffic that arrives. Remember that TCP guarantees order, whereas UDP does not. The sequence numbers are decimal number representations of a 32-bit field, so they can be pretty monstrous in size and intimidating to read.
TCPdump helps make the output more coherent by changing from the absolute ISNs to relative sequence numbers after the two hosts exchange their ISNs.
Look at the following TCPdump output. The time stamp has been omitted for clarity and to save space:
client.com.38060 > telnet.com.telnet: S 3774957990:3774957990(0) win 8760 <mss 1460> (DF)
telnet.com.telnet > client.com.38060: S 2009600000:2009600000(0) ack 3774957991 win 1024 <mss 1460>
client.com.38060 > telnet.com.telnet: . ack 1 win 8760 (DF)
client.com.38060 > telnet.com.telnet: P 1:28(27) ack 1 win 8760 (DF)
The first two lines show the very large ISNs in absolute format: 3774957990 from client.com and 2009600000 from telnet.com. The third line shows a relative number instead: ack 1. This means that client.com has acknowledged receiving the SYN from telnet.com, which carried an ISN of 2009600000. The 1 as the acknowledgment value means that the next relative byte client.com expects to receive is byte 1. Displayed as an absolute sequence number, it would be 2009600001.
In the final line, 1:28(27) indicates that, relative to the absolute sequence number of 3774957990, the 1st byte through (but not including) the 28th byte are sent from client.com to telnet.com, 27 bytes in all. The final line also has ack 1. This acknowledgment number will not change until telnet.com sends more data.
If you ever need to leave the sequence numbers in their absolute form, the TCPdump -S option alters the default behavior of expressing TCP sequence numbers in relative terms after the exchange of the ISNs.
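The conversion between absolute and relative numbers is simple modular arithmetic on the 32-bit sequence space. The following Python sketch uses the ISNs from the sample output; the function names are hypothetical.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit values and wrap around


def to_relative(absolute: int, isn: int) -> int:
    """Convert an absolute sequence number to one relative to the ISN."""
    return (absolute - isn) % SEQ_SPACE


def to_absolute(relative: int, isn: int) -> int:
    """Convert a relative sequence number back to its absolute value."""
    return (isn + relative) % SEQ_SPACE


if __name__ == "__main__":
    telnet_isn = 2009600000
    client_isn = 3774957990
    print(to_absolute(1, telnet_isn))           # 2009600001, the "ack 1"
    print(to_relative(3774957991, client_isn))  # 1
    print(to_absolute(28, client_isn))          # 3774958018, end of 1:28(27)
```

The modulo keeps the result correct when an ISN sits near the top of the 32-bit range and the sequence numbers wrap around to zero.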
Changing the TCPdump Collection Interface
You might find that you want to read TCPdump traffic from a different interface than the default one. The default interface is the lowest-numbered active one, not including the loopback interface. For instance, if you were on a Linux box and had two NICs, one might be known as eth0 and the next eth1. To select an interface other than the default, use the -i option of TCPdump.
The following command will select ppp0 as the listening interface:
tcpdump -i ppp0
Dumping in Hexadecimal
TCPdump does not display all the fields of the captured data. For example, the IP header has a field that stores the length of the IP header.
How do you display this field if it is not available from the standard TCPdump output?
There is a TCPdump command-line option (-x) that dumps the entire datagram captured with the default snaplen in hexadecimal. Hexadecimal output is far more difficult to read and interpret, but it is necessary to display the entire captured datagram.
To interpret TCPdump hexadecimal output, you need some reference material that discusses the format of the IP datagram headers and describes what each of the fields represents. (One such reference title is TCP/IP Illustrated, Volume 1, by W. Richard Stevens.) You then must translate hexadecimal to decimal for numeric fields and hexadecimal to ASCII for character fields.
Ethereal is probably the best tool for translating TCPdump records that are stored in binary form with the -w command-line option; it can read TCPdump binary data as input.
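As an illustration of that translation, the Python sketch below decodes the fixed 20-byte portion of an IP header from a hexadecimal string. It assumes the dump begins at the IP header, with no link layer header in front of it, and the sample bytes are made up for the example rather than taken from a real capture. The function name decode_ip_header is hypothetical.

```python
import socket
import struct


def decode_ip_header(hex_dump: str) -> dict:
    """Decode the first 20 bytes of an IPv4 header given as hex text."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes of IP header")
    (ver_ihl, _tos, total_len, _ident, _frag,
     ttl, proto, _cksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        # The header length field counts 32-bit words, so multiply by 4.
        "header_length": (ver_ihl & 0x0F) * 4,
        "total_length": total_len,
        "ttl": ttl,
        "protocol": proto,  # 6 is TCP, 17 is UDP, 1 is ICMP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


if __name__ == "__main__":
    # Hypothetical sample header, grouped the way hexadecimal dumps usually are.
    sample = "4500 0073 0000 4000 4011 b861 c0a8 0001 c0a8 00c7"
    print(decode_ip_header(sample))
```

In the sample, the first byte 45 splits into version 4 and a header length of 5 words, or 20 bytes, which is the field the standard output does not display.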
TCPdump for Windows
We could not finish our tcpdump guide without referring to the Windows version of this tool.
TCPDUMP for Windows is a clone of TCPDUMP for UNIX, compiled with the original tcpdump code (tcpdump.org) and Microolap’s own packet capture technology, the Microolap Packet Sniffer SDK (no libpcap/WinPcap/npcap).
List of the Windows OS supported by Microolap TCPDUMP for Windows:
- Windows XP
- Windows Vista
- Windows Server 2003
- Windows Server 2008
- Windows Server 2012
- Windows 8
- Windows 10
- Windows Server 2016
- Windows Server 2019
(at this time, the app was still not available)