Transport layer overview
The transport layer provides end-to-end communication services to the application layer. Communication between two hosts is, more precisely, communication between application processes running on those hosts. At the network layer, which we covered earlier, the IP protocol can deliver a packet to the correct destination host, but it stops there and does not know which application process on that host should receive it. From the earlier study we know that a MAC address identifies a host within one network and an IP address identifies the network, so combining the two locates the right host across different networks. To locate the application process on that host, something else has to identify it, and that is what we usually call a port.
A port number occupies 16 bits, so there are 65536 possible values, from 0 to 65535. Communication between hosts, that is, communication between application processes, depends on ports. A communicating process is associated with a port. Suppose process A communicates with process B, and process A is assigned port 60000 while process B is assigned port 60001. When process A sends data from port 60000 addressed to port 60001, the receiving host hands the data to port 60001, that is, to process B, and the communication is achieved.
Well-known ports, registered ports, client ports
Well-known ports: 0-1023. These are fixed port numbers assigned to common services. For example, HTTP uses port 80, which means that when we visit a website we connect to port 80 on the server, and the server then sends the web page data back to us.
Registered ports: 1024-49151. For example, suppose Microsoft develops a system application that needs port xxx for its communication. The company registers that port so that other vendors' applications do not use the same port number. Port 3389 on Windows, for instance, is used for remote connections and is fixed for that purpose. If you want to use the remote connection service, open port 3389 and others can connect to you remotely. It is not open by default.
Client ports: 49152-65535. When we use ordinary software such as QQ, the client takes a random port from this range rather than one of the fixed ports above, and releases the port when the communication ends.
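The three ranges can be written down as a small lookup. This is only a sketch of the ranges listed above; the function name classify_port and the returned labels are made up for illustration.

```python
def classify_port(port: int) -> str:
    """Return which of the three ranges described above a port number falls in."""
    if not 0 <= port <= 65535:
        # A port number is 16 bits wide, so nothing outside 0..65535 is valid.
        raise ValueError(f"port must fit in 16 bits, got {port}")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "client"


for port in (80, 3389, 60000):
    print(port, classify_port(port))
```

Port 80 lands in the well-known range, 3389 in the registered range, and 60000 in the client range.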
The transport layer is the medium that connects two ports so they can communicate. Knowing what the two ports are used for is not enough; actually carrying the data between them is the job of the transport layer, and it relies mainly on two protocols, UDP and TCP.
UDP: User Datagram Protocol, connectionless and unreliable
Connectionless: no connection has to be established before communication; data is simply sent.
Unreliable: UDP sends datagrams from one host to another but does not guarantee that they reach the other end; any reliability that is needed must be provided by the application. UDP preserves the size of each message that is sent (message boundaries are kept), but it cannot guarantee that the message arrives. It has no timeout or retransmission mechanism. When a UDP datagram is encapsulated in an IP datagram and cannot be delivered, an ICMP error message may be sent back to the source host in some cases (for example, when the destination port is unreachable), but a datagram that is silently dropped produces no notification. UDP performs no flow control or congestion control, even when the network is congested. It does not retransmit lost packets, and it does not restore the order of packets that arrive out of sequence.
UDP header format
Source port number: 16 bits, the port number used by the application process of the source host
Target port number: 16 bits, the port number used by the application process of the target host, that is, the target process we need to communicate with
UDP packet length: 16 bits, the length of the whole UDP datagram, that is, the UDP header plus the data part.
Checksum: 16 bits, used to detect errors in the UDP header and data. Do not confuse this with the unreliable transmission described above. A process may receive messages from many other processes, and they are told apart by five things: source IP address, destination IP address, protocol number, source port number, and destination port number. The checksum makes sure that a message really was meant for this destination and was not corrupted on the way, so the host can decide which messages to accept on a given port. "Unreliable" means something different: a message may be lost and UDP will not recover it. In other words, UDP does not promise delivery, but whatever is delivered must have reached the correct destination.
To compute the checksum, UDP borrows some fields from the IP layer (source IP address, destination IP address, and protocol number) as a pseudo-header, because they are needed for the check. The checksum algorithm is the same as the one used for the IP header checksum.
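The header fields and the checksum can be sketched in a few lines. This is a minimal illustration, assuming IPv4 and the usual layout of the borrowed IP-layer fields (commonly called the pseudo-header): source IP, destination IP, a zero byte, protocol number 17, and the UDP length. The function names are made up for this example.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's-complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"              # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    # Assumed IPv4 layout: source IP, destination IP, zero, protocol (17 = UDP), UDP length.
    return struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                       socket.inet_aton(dst_ip), 0, 17, udp_length)


def build_udp_datagram(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> bytes:
    length = 8 + len(payload)        # the UDP header itself is 8 bytes
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)  # checksum field zero for now
    checksum = internet_checksum(pseudo_header(src_ip, dst_ip, length) + header + payload)
    checksum = checksum or 0xFFFF    # a computed 0 is transmitted as 0xFFFF
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def verify_udp_datagram(src_ip, dst_ip, datagram: bytes) -> bool:
    # Summing everything, checksum included, must give 0 if nothing was altered.
    return internet_checksum(pseudo_header(src_ip, dst_ip, len(datagram)) + datagram) == 0


datagram = build_udp_datagram("192.168.0.1", "192.168.0.199", 60000, 60001, b"hello")
print(struct.unpack("!HHHH", datagram[:8]), verify_udp_datagram("192.168.0.1", "192.168.0.199", datagram))
```

Because the IP addresses are part of the sum, a datagram that is verified against a different destination address fails the check, which is the "delivered to the correct destination" guarantee in practice.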
For a given target process, every packet in its queue has the same destination port and destination IP address, but the source IP address and source port may differ. That means packets from different sources with the same destination end up in the same queue. This is different from TCP, discussed next: because UDP is connectionless, all senders share this one channel, which is why the queue looks like this.
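A small loopback experiment shows both points: no connection is set up, and datagrams from different sources arrive on the same socket. This is a sketch using Python's standard socket module on 127.0.0.1; the variable names are illustrative, and on a real network either datagram could be lost, in which case recvfrom would hit the timeout.

```python
import socket

# Receiver: one UDP socket bound to a loopback port chosen by the OS (port 0 means "pick one").
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
destination = receiver.getsockname()

# Two independent senders. There is no connection setup, they just send.
sender_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender_a.sendto(b"from A", destination)
sender_b.sendto(b"from B", destination)

# Both datagrams land in the same receive queue; only the source address differs.
received = {}
for _ in range(2):
    data, source = receiver.recvfrom(1024)
    received[data] = source
    print(data, "came from", source)

for sock in (receiver, sender_a, sender_b):
    sock.close()
```

The two source addresses share the IP 127.0.0.1 but carry different client ports, which is how the receiver tells the senders apart.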
Examples of protocols that use UDP:
DNS, the application layer protocol that resolves domain names to IP addresses, uses UDP.
DHCP, the protocol that assigns IP addresses to computers, also uses UDP.
Multicast (with group membership managed by IGMP) typically carries its data over UDP. In a multimedia classroom, the teacher lectures from a laptop and we watch the teacher's screen on our own computers. That data is sent over UDP, so because the transmission is unreliable, some students see the picture stutter while others see it play smoothly. A stutter, however, has no effect on what is shown afterwards.
TCP is a connection-oriented protocol that offers reliable transmission, flow control, congestion control, and byte-stream-oriented transmission, among other advantages. Its end purpose is the same as UDP's, communication between end systems, but it works very differently.
Structure of a TCP packet
Source port number: 16 bits
Target port number: 16 bits
Sequence number: Because TCP is byte-stream oriented, it numbers every byte of the data it sends. For example, a 900-byte message is numbered 1 to 900 and then sent in several segments. If the first segment has sequence number 1 and carries 50 bytes, the second segment has sequence number 51. The sequence number is therefore the position of the segment's first data byte within the whole byte stream.
Acknowledgment number: Continuing the example, after you send the first 50 bytes the other side replies with an acknowledgment number of 51, meaning that byte 51 is the next one it expects. The acknowledgment number tells the sender which byte to send next.
Header length: 4 bits, the length of the TCP header, counted in 32-bit words
Reserved: set aside for future use
Control bits: currently there are 6 control bits
URG: urgent. When URG is 1, the urgent pointer field is valid and the segment carries urgent data. When it reaches the target host it does not wait in the queue; it is moved ahead so that the application receives it as soon as possible.
ACK: acknowledgment. When ACK is 1, the acknowledgment number is valid. When ACK is 0, the acknowledgment number is ignored.
PSH: push. Normally the receiving side waits until a certain amount of data has accumulated in its buffer before delivering it to the application process. When a segment with PSH set to 1 arrives, the receiver stops waiting for more data and delivers what it has right away, so the application gets the message sooner. This must be clearly distinguished from URG. Urgent data jumps the queue, but the amount of buffered data delivered at a time is unchanged. Pushed data stays in order in the queue, but when it is reached the buffered data is delivered early instead of waiting for the buffer to fill.
RST: reset. When a serious error occurs, such as a broken TCP connection, RST is set to 1, the connection is released, and it has to be established again from scratch.
SYN: synchronize. Used together with ACK when establishing a connection, that is, in the three-way handshake described in detail below.
FIN: finish. Used when releasing a connection, that is, in the four waves.
Window: the size of the receive window of the party sending this segment. It controls how much data the other side may send, counted from the acknowledgment number. This is the window size of the sliding window discussed later.
Checksum: covers both the header and the data. As with UDP, a pseudo-header is included in the computation.
Options: variable length. One option is the maximum segment size (MSS), which tells the other TCP that the largest data field my buffer can accept in one segment is MSS bytes. If no options are used, the header is fixed at 20 bytes.
Padding: added so that the header length is a whole number of 32-bit words
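The fixed 20-byte header described by the fields above can be packed and unpacked with Python's struct module. This is a sketch under stated assumptions: the function names are illustrative, no options are included, and the checksum and urgent pointer are simply left at zero rather than computed.

```python
import struct

# Control bits from least significant to most significant within the flags byte.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]
HEADER_FORMAT = "!HHIIBBHHH"  # ports, seq, ack, offset/reserved, flags, window, checksum, urgent


def build_tcp_header(src_port, dst_port, seq, ack, flags, window) -> bytes:
    """Pack a 20-byte TCP header without options (checksum and urgent pointer left at 0)."""
    flag_bits = sum(1 << FLAG_NAMES.index(name) for name in flags)
    offset_and_reserved = 5 << 4   # header length 5 words of 32 bits = 20 bytes
    return struct.pack(HEADER_FORMAT, src_port, dst_port, seq, ack,
                       offset_and_reserved, flag_bits, window, 0, 0)


def parse_tcp_header(header: bytes) -> dict:
    (src_port, dst_port, seq, ack, offset_and_reserved,
     flag_bits, window, checksum, urgent) = struct.unpack(HEADER_FORMAT, header[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_bytes": (offset_and_reserved >> 4) * 4,
        "flags": {name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)},
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


header = build_tcp_header(60000, 80, seq=51, ack=1, flags={"ACK", "PSH"}, window=4096)
print(parse_tcp_header(header))
```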
Connection-oriented (three-way handshake): before communication, TCP uses the three-way handshake to confirm that the connection between the two ports is usable. UDP does not confirm this and simply transmits.
Three-way handshake mechanism
At first, both the client and the server are in the CLOSED state. At some point the client needs to communicate with the server. Both sides prepare their ports: the port on the server side enters the listening state and waits for a connection from the client, and the client must know its own port number and the port number of the destination process in order to initiate the request.
The first handshake: the client wants to connect to the server, so it performs an active open and sends a connection request segment with SYN=1 that carries an initial sequence number x. After sending it, the client enters the SYN_SENT state, in which it waits for the server's confirmation (so that it can send its own confirmation in the third handshake).
The second handshake: after receiving the connection request, the server, which has been in the LISTEN state since its passive open, returns a segment to the client. This segment has two purposes: it acknowledges the client's request, and it tells the client that the server has opened its side of the connection too. After sending it, the server enters the SYN_RCVD state, in which it waits for the client's confirmation.
The third handshake: once the client receives the server's confirmation and knows the server is ready, it sends a confirmation segment back, telling the server "I received your message, let's connect." After sending it, the client enters the ESTABLISHED state, and the server enters ESTABLISHED when the confirmation arrives. At that point the connection is complete and data can be exchanged.
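The three segments and the state changes can be traced in a short simulation. This is a sketch, not a real TCP stack: x and y stand for the initial sequence numbers picked by the client and the server, and the "+1" acknowledgment arithmetic follows standard TCP behaviour (a SYN consumes one sequence number) rather than anything spelled out in the text. The Segment class and function name are invented for this example.

```python
from dataclasses import dataclass

SEQ_MODULUS = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


@dataclass(frozen=True)
class Segment:
    flags: tuple
    seq: int
    ack_no: int = 0


def three_way_handshake(x: int, y: int):
    """Return (segment, client state, server state) after each of the three handshakes."""
    steps = []

    # 1st handshake: client -> server, SYN=1, seq=x. The server is still listening.
    syn = Segment(flags=("SYN",), seq=x)
    steps.append((syn, "SYN_SENT", "LISTEN"))

    # 2nd handshake: server -> client, SYN=1 and ACK=1, seq=y, acknowledging x+1.
    syn_ack = Segment(flags=("SYN", "ACK"), seq=y, ack_no=(syn.seq + 1) % SEQ_MODULUS)
    steps.append((syn_ack, "SYN_SENT", "SYN_RCVD"))

    # 3rd handshake: client -> server, ACK=1, acknowledging y+1.
    # The server becomes ESTABLISHED once this segment arrives.
    ack = Segment(flags=("ACK",), seq=syn_ack.ack_no, ack_no=(syn_ack.seq + 1) % SEQ_MODULUS)
    steps.append((ack, "ESTABLISHED", "ESTABLISHED"))
    return steps


for segment, client_state, server_state in three_way_handshake(x=100, y=300):
    print(segment, client_state, server_state)
```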
Question: why is the third handshake needed? Aren't the first two enough?
Suppose there were no third handshake. The client sends a connection request, but it is delayed in the network. After a timeout the client resends the request, the server returns its confirmation, communication proceeds normally, and the connection is eventually closed. Only then does the first, delayed request reach the server. The server cannot tell that this request is stale, so it sends back a confirmation. The client receives it, sees that it has no outstanding connection request (the old one timed out long ago), and ignores the confirmation. The server, however, does not see it that way: it believes the connection has been established, keeps it open, and waits for the client to send data, which wastes resources. With a third handshake, the server waits for the client's confirmation before treating the connection as established, so a stale request does no harm. That is why the third handshake matters.
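The argument can be reduced to a toy model that counts how many connections the server believes are open after one stale request arrives. Everything here is hypothetical scaffolding (the function name, the "old-request" identifier, the boolean switch); it only mirrors the reasoning above and ignores details such as timers.

```python
def server_connections_after_stale_syn(three_way: bool) -> int:
    """Count connections the server considers established after one stale SYN arrives."""
    # The client gave up on the old request long ago, so nothing is pending on its side.
    client_pending_requests = set()
    stale_syn = {"conn_id": "old-request"}

    # The server cannot know the SYN is stale, so it answers with a confirmation.
    server_state = "SYN_RCVD"
    if not three_way:
        # Two-way variant: the server treats the connection as open right away.
        server_state = "ESTABLISHED"

    # The client only confirms requests it still considers pending.
    client_sends_ack = stale_syn["conn_id"] in client_pending_requests
    if three_way and client_sends_ack:
        server_state = "ESTABLISHED"

    return 1 if server_state == "ESTABLISHED" else 0


print("two-way handshake:", server_connections_after_stale_syn(three_way=False))
print("three-way handshake:", server_connections_after_stale_syn(three_way=True))
```

With only two handshakes the server is left holding one useless connection; with three it never leaves SYN_RCVD for the stale request.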
Simultaneous open
Normally one side requests a connection and the other side responds. If both sides request a connection at the same time, the process is no longer a three-way handshake, yet only one connection is established, not two. In a simultaneous open, both sides send a SYN at almost the same moment and enter the SYN_SENT state. When each side receives the other's SYN, its state changes to SYN_RCVD and it sends a SYN with ACK to acknowledge the SYN it received. When each side receives that SYN and the corresponding ACK, its state becomes ESTABLISHED.
Reliable transmission: TCP achieves reliable transmission through four mechanisms: data numbering with cumulative acknowledgment, a byte-based sliding window, a retransmission timeout, and fast retransmission.
Data numbering: every byte is numbered. If there are 900 bytes, they are numbered 1 to 900.
Cumulative acknowledgment: the receiver does not acknowledge every piece of data as it arrives, which would be too inefficient. Instead it sends one acknowledgment after receiving several segments, and that acknowledgment means everything before the acknowledged position was received successfully.
Sliding window: the same idea as the sliding window at the data link layer. The data that may be sent at any moment is what lies inside the window, and the window slides forward by as much data as has been acknowledged.
Retransmission timeout: also covered at the link layer. If no acknowledgment arrives within a certain time, the data is retransmitted.
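Fast retransmission: an addition to the sliding window. Suppose segments 1, 2, 3, 4, and 6 reach the server. The old method waits for the timeout and then retransmits everything after 4. With fast retransmission the receiver repeats its acknowledgment asking for segment 5 each time a later segment arrives, and after three such duplicate acknowledgments the sender retransmits segment 5 immediately, after which the receiver continues from the data it already holds.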
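Cumulative acknowledgment and fast retransmission can be sketched together using the 1, 2, 3, 4, 6 example. To keep it short this model numbers whole segments rather than bytes, and it assumes the common threshold of three duplicate acknowledgments; the function names are illustrative.

```python
def cumulative_acks(arrivals):
    """Receiver side: after each arriving segment, acknowledge the next segment expected in order."""
    received = set()
    next_expected = 1
    acks = []
    for number in arrivals:
        received.add(number)
        while next_expected in received:   # advance past everything received in order
            next_expected += 1
        acks.append(next_expected)
    return acks


def fast_retransmit_target(acks, threshold=3):
    """Sender side: return the segment to resend once `threshold` duplicate ACKs have been seen."""
    last, duplicates = None, 0
    for ack in acks:
        if ack == last:
            duplicates += 1
            if duplicates == threshold:
                return ack
        else:
            last, duplicates = ack, 0
    return None


# Segment 5 is lost; segments 6, 7 and 8 still arrive and each triggers a duplicate ACK.
acks = cumulative_acks([1, 2, 3, 4, 6, 7, 8])
print(acks, fast_retransmit_target(acks))
```

Once segment 5 is resent and arrives, the next cumulative acknowledgment jumps straight past the segments the receiver was already holding.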
Flow control: each side of a TCP connection has a receive buffer and a send buffer. Every segment sent to the other end carries information that tells the peer how much buffer space is available, and the peer sets its own send window accordingly. If the other side's buffer is almost full, it says so in the segments it sends back, and you should shrink your sliding window so that the other side has time to drain its buffer. This avoids buffer overflow and keeps your own packets from being discarded.
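The idea can be modelled with a receive buffer that advertises its free space and a sender that never exceeds it. This is a simplified sketch: the class and function names are made up, sizes are plain byte counts, and the advertisement is read directly instead of travelling in a segment.

```python
class ReceiveBuffer:
    """Receiver-side buffer that advertises how much more data it can take."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buffered = 0

    def advertised_window(self) -> int:
        return self.capacity - self.buffered

    def accept(self, nbytes: int) -> int:
        """Store up to nbytes and return how many bytes actually fit."""
        accepted = min(nbytes, self.advertised_window())
        self.buffered += accepted
        return accepted

    def application_reads(self, nbytes: int) -> None:
        """The application drains data, which frees buffer space again."""
        self.buffered -= min(nbytes, self.buffered)


def send_respecting_window(pending: int, peer: ReceiveBuffer) -> int:
    """Sender side: never put more on the wire than the peer's advertised window."""
    to_send = min(pending, peer.advertised_window())
    return peer.accept(to_send)


peer = ReceiveBuffer(capacity=1000)
print(send_respecting_window(700, peer), peer.advertised_window())
print(send_respecting_window(700, peer), peer.advertised_window())
peer.application_reads(500)
print(peer.advertised_window())
```

The second send is cut down to the remaining space, and the window only reopens after the receiving application has read some data.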
Congestion control: this is similar to flow control but takes a wider view. It considers not only whether the other side can keep up without overflowing its buffer, but also that the link has limited capacity and is shared by many users. If too much data is sent at once, the link becomes congested, routers cannot forward everything, and a lot of data is lost. Congestion control therefore shrinks the sender's sliding window when congestion is detected; exactly how the window changes depends on the algorithm. The upper limit of the send window = Min[rwnd, cwnd]
rwnd: receive window, determined by the receive buffer. The more free receive buffer there is, the larger the receive window.
cwnd: congestion window, determined by how congested the path is. If there is no congestion, the window grows larger.
The send window is the smaller of the two values. The slow start and fast recovery algorithms are combined to implement congestion control.
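A rough model of how the limit Min[rwnd, cwnd] evolves during slow start is sketched below. It is a simplification with several assumptions that go beyond the text: windows are counted in whole segments, cwnd doubles once per round trip until it reaches a threshold called ssthresh, after that it grows by one segment per round trip, and no loss occurs. The function names are illustrative.

```python
def send_window(rwnd: int, cwnd: int) -> int:
    """Upper limit of the send window: the smaller of the receive and congestion windows."""
    return min(rwnd, cwnd)


def windows_per_round(rwnd: int, ssthresh: int, rounds: int, initial_cwnd: int = 1):
    """Return the send window for each round trip, assuming no segment is lost."""
    cwnd = initial_cwnd
    windows = []
    for _ in range(rounds):
        windows.append(send_window(rwnd, cwnd))
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double every round trip
        else:
            cwnd += 1                        # past the threshold: grow by one segment
    return windows


print(windows_per_round(rwnd=20, ssthresh=16, rounds=8))
print(windows_per_round(rwnd=10, ssthresh=16, rounds=8))
```

In the second run the receiver's window of 10 segments caps the sender even though the congestion window keeps growing, which is the Min[rwnd, cwnd] rule at work.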
Four waves when TCP releases the connection
The first wave: the client performs an active close from the ESTABLISHED state and sends the server a connection release request with FIN=1. After sending it, the client enters the FIN_WAIT_1 state, in which it waits for confirmation.
The second wave: after receiving the client's release request, the server enters the CLOSE_WAIT state and sends a confirmation to the client, telling it that the request was received. Why CLOSE_WAIT? Because when the client asks to release the connection, the server may still have data left to send, so the TCP connection as a whole becomes half-closed. The server can still send data and the client can still receive it, but the client can no longer send data, only acknowledgments. After the client receives the server's confirmation, it enters the FIN_WAIT_2 state, in which it waits for the server to release the connection.
The third wave: once the server has sent all its data and decides the connection can be closed, it performs a passive close and sends a connection release segment to the client. After sending it, the server enters the LAST_ACK state, in which it waits for the client's confirmation.
The fourth wave: after receiving the release segment, the client sends a confirmation and enters TIME_WAIT instead of closing immediately. The confirmation may be lost, in which case the server retransmits its FIN, so the client must not be closed yet. When the server receives the confirmation it enters the CLOSED state. The client, for the reason just given, has to wait for a period of time (2MSL) before it enters CLOSED.
Simultaneous close
Normally one side requests that the connection be closed and the other side responds and closes passively. If both sides request a close at the same time, both move from ESTABLISHED to FIN_WAIT_1. When either side receives the FIN segment from the other, its state changes from FIN_WAIT_1 to CLOSING and it sends the final ACK segment. After receiving the final ACK, each side enters TIME_WAIT, waits for 2MSL, and then enters CLOSED, which finally releases the whole TCP connection.
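Both the four waves and the simultaneous close can be expressed as one state transition table. The sketch below uses event names invented for illustration ("close", "recv FIN", "recv ACK", "2MSL timeout") and covers only the closing states discussed here, not the full TCP state machine.

```python
# (current state, event) -> next state, with the segment sent noted in the comment.
TRANSITIONS = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",      # active close: send FIN
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",   # passive side: send ACK
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv FIN"): "CLOSING",       # simultaneous close: send ACK
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",     # send ACK
    ("CLOSING", "recv ACK"): "TIME_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",         # passive side is done: send FIN
    ("LAST_ACK", "recv ACK"): "CLOSED",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Apply the events in order and return every state visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


print("client:", run(["close", "recv ACK", "recv FIN", "2MSL timeout"]))
print("server:", run(["recv FIN", "close", "recv ACK"]))
print("simultaneous:", run(["close", "recv FIN", "recv ACK", "2MSL timeout"]))
```

In the simultaneous case both sides follow the third path, so each one passes through CLOSING and TIME_WAIT before reaching CLOSED.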