
In-depth understanding of the browser navigation process through Chrome

Network navigation is the process that starts when a URL is entered and ends when the requested file is obtained. It touches on browser architecture, the operating system, networking and more. This article walks through the process in detail from several angles, covering both breadth and depth. If you already have some background, it should help you organize fragmented knowledge into a system.

Navigation

In this section we take a single request as the starting point and follow its packet as it travels through the OSI model. The contents of this section are:

  • Navigation

    • Resolve URI

    • Build request

    • Find strong cache

    • DNS resolution

      • DNS hierarchy

      • Hosts

      • DNS resolution process

    • Protocol stack

      • Transport layer

      • TCP / IP protocol suite lower layer

      • ARP

      • ICMP

    • Network card and driver

    • Electric signal escort

      • Repeater

      • Hub

      • Bridge

      • switch

      • Gateway

    • Recurse to the opposite end

Resolve URI

When we type something into the address bar, such as 晨风, and press Enter, Chrome first parses the input and decides whether it is a URL or a search term. If it is a search term, Chrome URL-encodes it and appends it as a query parameter to the default search engine's URL.

If it is a URI, such as test.com, Chrome completes it by adding the http scheme, which uses the default port 80.
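
To make the decision concrete, here is a minimal Python sketch. The heuristic (a dot and no spaces means "host name") and the search URL template are assumptions made for illustration; they are not Chrome's actual rules.

```python
from urllib.parse import quote, urlsplit

# Hypothetical search URL template; a real browser reads this from its settings.
SEARCH_TEMPLATE = "https://search.example/?q={}"

def resolve_input(text: str) -> str:
    """Turn address-bar input into a URL (a rough heuristic, not Chrome's logic)."""
    text = text.strip()
    if "://" in text:
        return text  # already has a scheme
    # Treat input with a dot and no spaces as a host name and default to http.
    if "." in text and " " not in text:
        return "http://" + text
    # Otherwise URL-encode the text and append it to the search engine URL.
    return SEARCH_TEMPLATE.format(quote(text))

def default_port(url: str) -> int:
    """Explicit port if present, else the scheme's default."""
    parts = urlsplit(url)
    return parts.port or {"http": 80, "https": 443}[parts.scheme]
```

For example, `resolve_input("test.com")` yields `http://test.com`, for which `default_port` reports 80, while a search term is percent-encoded before being placed in the query string.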


At the Chrome level, if the address bar was showing an existing page, the operation above triggers that page's beforeunload and unload events. Meanwhile the tab switches to the loading icon state. A new page has two important milestones, which are described in detail in the rendering chapter:

  • interactive: the browser has finished HTML parsing, Recalculate Style, the Layout Tree, the Render Tree, the draw list and so on.

  • complete: the browser has finished rendering the page, which replaces the window's original bitmap and displays the latest interface. Between interactive and complete, the work is done by the compositor thread in the rendering process; Chrome's rendering process draws 2D interface elements with skia.

Early websites exposed the development path and a specific file suffix directly in the URI, for example https://www.test.com/home/index.html. This brought security problems such as unauthorized access. As demand for Internet services exploded, people expected more of the Web in both security and efficiency, so proxy servers were introduced to provide security, load balancing, caching and similar functions. Almost all modern websites use proxy servers to hide the real resource location.

Build request

Once the URI has passed the check, Chrome needs to turn it into a GET request. Before that, let us look at Chrome's architecture components.

Chrome currently uses an SOA (service-oriented) architecture. Its main characteristic is that the application's functions are split into services, which are linked through well-defined interfaces and protocols. The commonly used processes are as follows:

  • Browser main process: responsible for page display, user interaction, sub-process management and other functions

  • Rendering process: each tab has its own rendering process, regardless of whether it is same-site. It runs in a sandbox and processes HTML, CSS and JavaScript. V8 and Blink also run in this process.

  • Plug-in process: responsible for running plug-ins; whether a plug-in runs in the sandbox depends on its function

  • GPU process: handle some special CSS effects

  • NetWork Service: handles network resource loading and request/response processing, and verifies CORS.

  • Storage Service: handles storage control for localStorage, sessionStorage, cookie and IndexedDB.

  • Audio Service: handles audio and video buffers, volume, playback and similar operations

  • V8 PAC tool: uses V8 to parse PAC (proxy auto-config) files.

As the list shows, Chrome's main process has to hand the task of building the request to NetWork Service through IPC.

NetWork Service accepts the task and creates the GET request. The request line consists of the HTTP method, the request path and the version number; the request headers are supplied by Chrome.

The HTTP/2 standard introduced HPACK and Stream. HPACK's main purpose is to compress request header information so that less redundant data is transmitted on each connection. It indexes header fields in tables and compresses text with Huffman coding. The request line is gone: its contents are moved into header fields whose names start with : so that they can be told apart from ordinary request headers. The role of Stream is introduced later.
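
The difference between the two forms can be sketched in a few lines of Python. The function names are made up for this illustration, and the sketch only shows the logical shape of the request; it does not perform HPACK compression or binary framing.

```python
def http1_request(method: str, host: str, path: str) -> str:
    """HTTP/1.1: request line (method + path + version), then header lines."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}", "", ""]
    return "\r\n".join(lines)

def http2_headers(method: str, host: str, path: str, scheme: str = "https") -> list:
    """HTTP/2: no request line; its parts become pseudo-header fields starting with ':'."""
    return [
        (":method", method),
        (":scheme", scheme),
        (":authority", host),
        (":path", path),
    ]
```

Calling `http1_request("GET", "www.test.com", "/")` produces text that begins with `GET / HTTP/1.1`, whereas `http2_headers` returns the same information as a list of name and value pairs that an HPACK encoder would then compress.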

Find strong cache

NetWork Service asks Storage Service to look, in order, through the service worker cache, memory cache, disk cache and push cache (HTTP/2 Stream) for a usable strong cache entry for the URI. If one exists, the cached response goes straight to the browser's parsing stage; otherwise DNS resolution begins.
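
The lookup order can be sketched as a simple chain. This is only an illustration: each cache is modelled as a plain dict, the names are taken from the text, and freshness checks (whether the entry has expired) are left out.

```python
from typing import Optional

# Lookup order taken from the text; each cache is modelled as a plain dict.
CACHE_ORDER = ["service worker cache", "memory cache", "disk cache", "push cache"]

def find_strong_cache(uri: str, caches: dict) -> Optional[tuple]:
    """Return (cache name, response) for the first cache holding uri, else None."""
    for name in CACHE_ORDER:
        entry = caches.get(name, {}).get(uri)
        if entry is not None:
            return name, entry
    return None  # no hit: the caller goes on to DNS resolution
```

If both the memory cache and the disk cache hold the URI, the memory cache wins because it comes earlier in the order.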

To make it easier to study and verify, here are the locations of the non-memory caches on macOS:

  • Service Worker Cache: /Users/YOUR_NAME/Library/Application Support/Google/Chrome/Default/Service Worker/[CacheStorage || ScriptCache]

  • Disk Cache: /Users/YOUR_NAME/Library/Application Support/Google/Chrome/Default/Application Cache/Cache

Because Chrome has removed chrome://cache, you need to install a decoding tool yourself to view these files.

Under normal circumstances large files are stored in the disk cache and small files in the memory cache. When memory usage is high, however, files go to the disk cache first to relieve the pressure.

HTTP/2 provides multiplexing, header compression and Server Push. Of these, Server Push is the only one that has to be implemented manually. Server Push can return, in a stream, related data that the client has not yet requested, which saves unnecessary round trips.

DNS resolution

If the strong cache entry does not exist or has expired, NetWork Service goes on to send the message to the receiving end. This requires help from the OS: the message has to be handed to the OS protocol stack, but the OS cannot work with a domain name and therefore cannot help yet. We must supply an IP address. Resolving domain names into IP addresses is the job of DNS servers.

Domain names exist because they fit how people remember things. No one likes to remember meaningless IP addresses, so the DNS service maps a memorable domain to its IP.

DNS hierarchy

Because the Domain Name System was invented abroad, DNS levels are divided from right to left and separated by **.** Just as the family name comes last in an English name, the root domain sits at the rearmost part of the domain name, which does not match the way Chinese readers are used to remembering names.

According to the level of DNS server, it is divided into:

  • Root domain DNS server: does not store specific domain name information, but it is the entry point to all top-level domain DNS servers

  • Top-level domain DNS server: each domain suffix, such as cn, com, tech and so on, has its own servers. They also do not store specific domain name information; they are the entry point to the authoritative DNS servers for that suffix

  • Authoritative DNS server: as its name suggests, it is the authority for the mapping from domain to IP. It is the server that actually stores the mapping.

The DNS servers form a structure similar to a trie: each level of the tree holds one part of the domain, and the non-leaf nodes do not hold the final answer. The leaf nodes, called authoritative servers, are where the mapping between domain and IP is actually stored. The addresses of the root DNS servers are stored in every DNS server on the Internet, so a client only needs to reach any DNS server to find the root servers and follow the tree down to the target IP.

Chrome has officially supported DoH (DNS-over-HTTPS) since version 83. Its main purpose is to prevent DNS requests, which are traditionally sent in plaintext, from being tampered with by a man in the middle. In short, DoH is a DNS request protected by TLS.

Hosts

Just as JavaScript's Promise supports thenables, instanceof supports Symbol.hasInstance and JSON.stringify supports toJSON, many systems offer an entry point for customized behavior.

Domain name resolution has a local customization entry too: hosts. It is a local "database" that associates a domain with an IP address, and it takes priority over the DNS service.

DNS resolution process

Taking a visit to http://www.test.com as an example, the DNS resolution process is as follows:

  • Check whether hosts stores a mapping from the target domain to an IP address; if so, return it directly to the client.

  • If hosts has no entry for the domain, the client creates a DNS request and asks the local DNS server for the IP address of the domain.

  • The local DNS server receives the request and first checks whether its DNS cache holds the IP address for the domain; if so, it returns it directly to the client. If not, it looks up the root server address it has on record and asks the root server for the IP address of the domain.

  • The root server does not store specific data, but it points to the next server to ask: the address of the com top-level domain server.

  • The local DNS server receives the root server's response and goes on to ask the com top-level domain server.

  • Likewise, the top-level domain server returns the address of the authoritative server for test.com.

  • The local DNS server goes on to ask the authoritative server, which is the original source of the resolution result and the last server to be asked.

  • The authoritative DNS server returns the IP address of the domain name to the local DNS server.

  • The local DNS server caches the result and sends the IP to the OS.

  • The OS returns the IP to Chrome's NetWork Service.
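
The steps above can be sketched as a toy simulation in Python. Every "server" here is just a dict, and all server names and IP addresses are invented for the example; no real DNS traffic is involved.

```python
# Toy data: every "server" is a dict. All names and addresses are made up.
HOSTS = {"localhost": "127.0.0.1"}
ROOT = {"com": "tld-com"}                        # root knows the TLD servers
TLD = {"tld-com": {"test.com": "auth-test"}}     # TLD server knows authoritative servers
AUTH = {"auth-test": {"www.test.com": "203.0.113.10"}}
local_dns_cache = {}

def resolve(domain: str) -> str:
    if domain in HOSTS:                  # 1. the hosts file takes priority
        return HOSTS[domain]
    if domain in local_dns_cache:        # 2. the local DNS server's cache
        return local_dns_cache[domain]
    labels = domain.split(".")           # levels are read right to left
    tld_server = ROOT[labels[-1]]        # 3. root points to the TLD server
    auth_server = TLD[tld_server][".".join(labels[-2:])]  # 4. TLD points to authoritative
    ip = AUTH[auth_server][domain]       # 5. authoritative server holds the mapping
    local_dns_cache[domain] = ip         # 6. the local DNS server caches the result
    return ip
```

The first call to `resolve("www.test.com")` walks the whole hierarchy; a second call is answered from `local_dns_cache`.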

At this point NetWork Service has its "green card" and everything it needs. Finally it hands the data to the OS through the socket library so that it enters the protocol stack. This also marks leaving the OSI application layer.

Protocol stack

With the help of the OS, the request enters the protocol stack. Between the application layer and the transport layer, the stack handles HTTP/2's HPACK and Stream. If the domain uses TLS / SSL, a cipher suite is selected from the local list of cipher suites and the information is added to the data packet.

Transport layer

At this point the data packet reaches the upper layer of the protocol suite, which is the general name for the protocols that work in the transport layer and the network layer. The suite is divided into an upper and a lower part, which take on different tasks and follow rules governing their relationship: after the upper layer finishes its part of the work, it entrusts the lower layer to continue. The first thing to notice in the upper layer is the pair of protocols responsible for sending and receiving data: TCP / UDP.

TCP

TCP is a reliable, stateful, byte-stream-based protocol for one-to-one connections. Before HTTP data is transmitted, a TCP connection must be established. Establishing a TCP connection is usually called the three-way handshake.

Before we look at TCP in depth, we must understand the MTU, the maximum length of a network packet, which is generally 1500 bytes on Ethernet. The length of our HTTP data is likely to exceed 1500 bytes, so the content has to be split up: TCP segments the message and adds information to ensure each packet reaches the receiving end. The maximum length of data in a TCP segment is the MSS, calculated as MTU - TCP header - IP header. Now let us look at what the TCP header adds.
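
As a worked example of the MSS formula, assume the common case of a 1500-byte Ethernet MTU and 20-byte IPv4 and TCP headers without options; these sizes are assumptions of the sketch, and real headers can be larger.

```python
MTU = 1500          # typical Ethernet MTU in bytes
IP_HEADER = 20      # IPv4 header without options
TCP_HEADER = 20     # TCP header without options
MSS = MTU - IP_HEADER - TCP_HEADER  # 1460 bytes of payload per segment

def segment(data: bytes, mss: int = MSS) -> list:
    """Split data into chunks of at most mss bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]
```

A 3000-byte HTTP message would therefore travel as three segments of 1460, 1460 and 80 bytes.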


Source port, destination port

First comes a pair of port numbers. Without them, when the data packet arrives the host would not know which application it belongs to.
Together, the source IP, source port, destination IP and destination port form the unique identifier of a connection.
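
A small sketch of how a four-tuple can serve as a lookup key. The addresses and port numbers are made-up examples, and the table is a plain dict rather than anything an operating system actually uses.

```python
from typing import NamedTuple, Optional

class FourTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

# Connection table keyed by four-tuple; addresses and ports are made-up examples.
connections = {
    FourTuple("192.0.2.5", 50001, "203.0.113.10", 80): "connection A",
    FourTuple("192.0.2.5", 50002, "203.0.113.10", 80): "connection B",
}

def demultiplex(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> Optional[str]:
    """Find the connection that a segment with these addresses belongs to."""
    return connections.get(FourTuple(src_ip, src_port, dst_ip, dst_port))
```

The two entries share the same client IP, server IP and server port; only the source port differs, and that is enough to tell them apart.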

At this point some readers may ask: when the browser opens multiple tabs that access the same domain name and port, how does the data find its way to the correct tab? The author guessed that Chrome may identify it by TCP timestamp or ISN; if any reader can provide an accurate answer, please point it out. One relevant fact is that each TCP connection a client opens gets its own source port, so two connections to the same server already differ in their four-tuples.

Sequence Number

Abbreviated seq, it is the sequence number of the first byte of this segment. The sequence number is 4 bytes long, that is, a 32-bit unsigned integer. When it reaches the maximum value it wraps around to 0. Its main functions are:

  • Confirm that this end is able to send.

  • Exchange the ISN when the SYN message is first sent.

  • Ensure that split data packets are reassembled in the correct order.

ISN is the Initial Sequence Number, which is exchanged in the first two handshakes of the three-way handshake. The ISN is not a fixed value: it is incremented by one roughly every 4 microseconds and wraps to 0 on overflow. This makes it much harder for an attacker to guess the ISN, forge the IP and port, and attack the connection with forged TCP segments.
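
The wraparound behaviour can be illustrated with modular arithmetic. This is a simplified sketch; the helper names are invented here and it is not how any particular TCP stack is written.

```python
SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit unsigned integers

def next_seq(seq: int, payload_len: int) -> int:
    """Sequence number of the byte following this segment, with wraparound."""
    return (seq + payload_len) % SEQ_MOD

def seq_before(a: int, b: int) -> bool:
    """True if a comes before b in sequence space, allowing for wraparound."""
    return a != b and (b - a) % SEQ_MOD < SEQ_MOD // 2
```

A segment of 20 bytes that starts 10 bytes before the maximum value is followed by sequence number 10, and `seq_before` still orders the two numbers correctly.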

Acknowledgment Number

Abbreviated ack, it occupies 4 bytes like seq. It means that all bytes before this number have been received. Its main functions are:

  • Confirm that this end is able to receive

  • Tell the sender the start position of the data expected next

Flag bits

Different types of TCP segment need a marker that says how they should be handled; this is the flag bit.
Common flags are SYN, ACK, FIN, RST and PSH. This part is relatively basic; readers who are unclear about it can study it together with the three-way handshake.

Window size

It enables TCP flow control: both parties declare a window (buffer size) that indicates their current processing capacity. This is also called the initial window.
Besides the sliding window for flow control, TCP uses a congestion window for congestion control, starting from an initial congestion window and applying slow start, congestion avoidance, fast retransmit and fast recovery.

Checksum

It occupies two bytes and guards against damage to the data packet in transit. If a segment fails the checksum, TCP discards it, and the ack value returned stays unchanged, which tells the sender that a retransmission is needed.
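
The checksum algorithm itself is the 16-bit one's-complement sum described in RFC 1071. The sketch below applies it to a plain byte string; a real TCP checksum is also computed over a pseudo-header and the TCP header, which is omitted here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented (as in RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def is_intact(data: bytes, checksum: int) -> bool:
    """Receiver side: recompute the checksum and compare."""
    return internet_checksum(data) == checksum
```

If a single byte changes in transit, `is_intact` returns False and the receiver would drop the segment.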

Urgent pointer

This handles urgent situations (such as forcibly interrupting a connection)
in applications that need the receiver to handle some urgent data before the data queued ahead of it has been processed.

Options

These are the TCP options, the most important of which are:

  • TimeStamp: the TCP timestamp, which resolves ambiguity in RTT measurement and the problem of sequence number wraparound

  • MSS: as mentioned earlier, calculated as MTU - TCP header - IP header.

Going back to the three-way handshake: the so-called "connection" is just a state machine maintained on both computers. During connection establishment, the state of both parties changes from closed to established. Through the exchange of SYN and ACK in the three-way handshake, each side confirms the other's ability to send and receive.

On Linux we can use netstat -napt to see the state of TCP connections:

tcp        0      0 0.0.0.0:5440            0.0.0.0:*               LISTEN      9138/java
tcp     1070      0 199.161.10.251:9020     199.161.10.251:34512    CLOSE_WAIT  4122/java
tcp        1      0 199.161.10.251:60254    199.161.100.195:38399   CLOSE_WAIT  7377/java
tcp     1076      0 199.161.10.251:9020     199.161.10.251:34540    CLOSE_WAIT  4122/java
tcp      416      0 199.161.10.251:9020     199.161.10.251:39166    CLOSE_WAIT  4122/java
tcp        0      0 199.161.10.251:36956    199.161.10.116:22       ESTABLISHED 7377/java

In a real network, data transmission is blocked at this point: the three-way handshake with the opposite end must complete before data is sent. This is the main reason HTTP was said to be based on TCP before HTTP/3. It also shows that **TCP's head-of-line blocking** is an unavoidable problem.

TCP also provides a keep-alive function, but it is of little practical use.

UDP

UDP is a connectionless, stateless protocol that supports one-to-many sending.

Because TCP came first and its reliability has withstood the test of history, it is easy to believe that it will always dominate the transport layer of the Web. However, the pioneering work of the Google team has once again opened my eyes: the HTTP/3 standard abandons TCP, and UDP takes over the dominant position.

The main reason is probably familiar. A TCP connection must go through the three-way handshake, even with TFO (TCP Fast Open). To secure the data exchange, Transport Layer Security (TLS) has to be added on top, and even with the Session Ticket optimization it still costs an extra 1 RTT. We do not consider PSK because it is not safe. In short, establishing a TCP connection is relatively expensive. And because TCP is implemented in operating system kernels and middlebox firmware (the protocol stack mentioned above), it is almost impossible to make major changes to TCP.

UDP is a connectionless protocol. After the client sends a UDP packet, it can only "assume" that the server has received it. The advantage is that the transport layer does not need to verify delivery, so UDP is commonly used for online games and streaming media. Conversely, if reliable transmission is needed, a protocol above UDP has to confirm delivery itself. That protocol is QUIC.

QUIC is a low-latency Internet transport protocol built on UDP. HTTP/2 solves the head-of-line blocking caused by HTTP, but it cannot avoid the deeper TCP head-of-line blocking. Because QUIC is based on UDP, it avoids TCP's head-of-line blocking.

Depending on whether the server is new or already known, QUIC can establish a connection, including TLS, within 1-2 RTTs, which is very attractive.

Although QUIC has many advantages, it has not yet been widely adopted. Some routers block UDP on port 443, which QUIC uses; large volumes of UDP packets can make service providers mistake the traffic for an attack; and firewall support for QUIC is incomplete. Let us look forward to the day when the QUIC protocol specification is finalized and promoted.

This article follows the current mainstream, which is TCP-based. After the TCP header is added, the packet consists of the TCP header followed by the HTTP data.

TCP / IP protocol suite lower layer

To perform operations such as connecting, sending, receiving and disconnecting, the transport layer entrusts the IP protocol with encapsulating the data into a network packet and sending it to the communication partner.

The most important fields in the IP header are the source IP and destination IP.

  • The source IP is the IP address of the current client

  • The target IP is the receiving server IP obtained by DNS domain name resolution

Next is the protocol number in the IP header, which indicates the protocol used by the transport layer. It is expressed in hexadecimal; for example, 06 means TCP. After the IP header is added, the packet consists of the IP header, the TCP header and the HTTP data.

Routing table

From the IP header we know the destination IP, but not how far that IP is from us. In many cases we cannot send directly to the opposite end and have to pass through gateways several times. This process is governed by the rules of the routing table.

On Linux, route -n shows the current system's routing table.

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         199.161.100.1   0.0.0.0         UG    100    0        0 eth0
199.254.169.254 199.161.100.153 255.255.255.255 UGH   100    0        0 eth0
199.161.100.0   0.0.0.0         255.255.252.0   U     100    0        0 eth0

This step decides whether a gateway takes part in the path. Let's see how the routing table works.

  • First, take the entries of the routing table one at a time

  • Apply the subnet mask (Genmask) of each entry to the receiver's destination IP. If the result matches the Destination, the peer is on the same Ethernet as us and the next hop does not need to go through a gateway. The IP of the matching interface is used as the source address in the IP header.

  • If the matching above fails, the default gateway is matched. This is generally the router's IP, and the network packet is forwarded to the router, which sends it on.
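
The matching rule can be sketched in Python with the three rows from the route -n output above. Preferring the entry with the longest mask is an assumption of this sketch that mirrors common practice; the helper names are invented.

```python
import ipaddress

# (destination, genmask, gateway) rows copied from the route -n output above.
ROUTES = [
    ("0.0.0.0", "0.0.0.0", "199.161.100.1"),
    ("199.254.169.254", "255.255.255.255", "199.161.100.153"),
    ("199.161.100.0", "255.255.252.0", "0.0.0.0"),
]

def to_int(ip: str) -> int:
    return int(ipaddress.IPv4Address(ip))

def next_hop(dst: str) -> str:
    """Return the IP to deliver to next: a gateway, or dst itself if directly reachable."""
    best = None
    for dest, mask, gateway in ROUTES:
        m = to_int(mask)
        # Apply the Genmask to the destination IP and compare with Destination.
        if to_int(dst) & m == to_int(dest) and (best is None or m > best[0]):
            best = (m, gateway)          # prefer the most specific match
    _, gateway = best                    # the 0.0.0.0 default route always matches
    return dst if gateway == "0.0.0.0" else gateway
```

With this table, an address inside 199.161.100.0/255.255.252.0 is delivered directly, while any other address falls back to the default gateway 199.161.100.1.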

ARP

After the IP header is generated, the MAC header has to be added to the network packet. IP was created to manage the identity of computers across different Ethernets more conveniently. Every computer connected to a network has a network card, and each network card has a unique address, called the MAC address. Within an Ethernet, data is delivered between computers by MAC address. The MAC header contains the receiver's MAC address, the sender's MAC address and the protocol type.

The sender's MAC is easy to determine: it was written into the ROM when the network card was manufactured, so the value can be read and written into the MAC header directly.
The receiver's MAC address is more complicated. We already know the receiver's IP, and by applying the subnet mask we can divide receivers into two categories.

  • Neighbors on the same subnet

  • Communication objects in an external subnet, which we hand over to the "neighborhood committee aunt" (the gateway) to reach

So whether or not the communication object is a neighbor, the first hop we send to is always on the same subnet: either the neighboring server or the gateway. We therefore use broadcast to query the target's MAC address.

broadcast

ARP broadcasts on the Ethernet, asking all devices on it for the MAC address that corresponds to the next-hop IP address selected from the routing table.

It's like shouting in a playground: everyone can hear it, but those who are not being called simply do not respond. The one being called responds with its MAC address.

ARP cache

Like most services, ARP has its own cache to trade space for time. After obtaining the MAC address, the OS stores the query result in a memory area called the ARP cache for future use, but entries are kept for only a few minutes.
In other words, when constructing the MAC header:

  • Query the ARP cache first. If the MAC address of the other party is already stored there, there is no need to send an ARP broadcast query; the address in the ARP cache is used directly.

  • When the MAC address of the other party does not exist in the ARP cache, an ARP broadcast query is sent.
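
The cache-then-broadcast logic can be sketched as follows. The 120-second lifetime is an assumed value (real systems differ), and `arp_broadcast` is a stand-in that returns a hard-coded answer instead of sending a real ARP request.

```python
import time

ARP_TTL = 120.0   # seconds; an assumed value, real systems differ
arp_cache = {}    # ip -> (mac, time stored)

def arp_broadcast(ip: str) -> str:
    # Stand-in for a real ARP request; the answer table is hard-coded.
    return {"199.161.100.1": "79:2c:29:11:0a:32"}[ip]

def lookup_mac(ip: str, now=None) -> str:
    """Return the MAC for ip, using the cache when the entry is still fresh."""
    now = time.monotonic() if now is None else now
    hit = arp_cache.get(ip)
    if hit and now - hit[1] < ARP_TTL:
        return hit[0]                 # fresh cache entry: no broadcast needed
    mac = arp_broadcast(ip)           # miss or expired: ask the network
    arp_cache[ip] = (mac, now)
    return mac
```

The optional `now` parameter exists only so that the expiry behaviour can be exercised without waiting.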

On Linux, arp -a shows the contents of the ARP cache:

gateway (199.161.100.1) at 79:2c:29:11:0a:32 [ether] on eth0
? (199.161.100.251) at ff:91:13:17:a0:00 [ether] on eth0
? (199.161.101.189) at ff:63:8a:1f:83:00 [ether] on eth0
? (199.161.100.153) at b3:ab:ef:43:1d:40 [ether] on eth0

After obtaining the receiver's MAC address, the OS reads the network card's own MAC address from ROM and writes both into the MAC header. The packet now consists of the MAC header, the IP header, the TCP header and the HTTP data.

ICMP

The ICMP protocol is an integral part of IP and must be implemented by each IP module.

It is mainly used to transfer control messages between IP hosts and routers. Control messages are messages about the network itself, such as whether the network is reachable, whether the host is reachable, and whether the route is available.

Network card and driver

The network packet generated by the protocol stack is just a string of binary data in memory and cannot be sent directly; digital information has to be converted into electrical signals. At the lowest level a computer is a combination of logic circuits, in which hardware such as transistors switches between high and low voltages. Once converted into an electrical signal, the data can be transmitted over the network cable. This is where data is really sent. The unit that the network card is responsible for is called the Ethernet frame.

The network card performs this operation, but controlling it depends on the network card driver, which implements the operations the card supports. The steps are as follows:

  • After the driver obtains the network packet from the IP module, it copies the binary data into the buffer on the network card. To delimit this piece of data we need rules for transmitting binary, such as how many electrical signals form a group, how to recognize the beginning and the end, and so on.

  • Therefore the preamble and the start frame delimiter are added in front of the binary data to mark where the packet begins

  • An FCS (frame check sequence) is added at the end of the data packet to check whether the packet was damaged in transit.

  • Finally, the network card converts the packet into electrical signals, which are transmitted through physical media such as network cables and optical fibers.

Finally, the entire data frame consists of the preamble, the start frame delimiter, the MAC header, the IP header, the TCP header, the data and the FCS.
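
As a rough illustration of this framing, the sketch below assembles the bytes in Python. It assumes the conventional byte representation of the preamble (seven 0x55 bytes) and start frame delimiter (0xD5), uses zlib's CRC-32 for the FCS, and ignores details such as minimum frame size padding; in reality the network card does all of this in hardware.

```python
import struct
import zlib

PREAMBLE = b"\x55" * 7      # alternating bits that let the receiver synchronize
SFD = b"\xd5"               # start frame delimiter

def mac_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", ""))

def build_frame(dst_mac: str, src_mac: str, payload: bytes, ethertype: int = 0x0800) -> bytes:
    """Preamble + SFD + MAC header + payload + FCS (CRC-32)."""
    header = mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", ethertype)
    body = header + payload
    fcs = struct.pack("<I", zlib.crc32(body))   # byte order here is an assumption
    return PREAMBLE + SFD + body + fcs

def fcs_ok(frame: bytes) -> bool:
    """Receiver side: recompute the CRC over header and payload and compare."""
    body, fcs = frame[8:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == fcs
```

A receiver that recomputes the CRC and gets a different value knows the frame was damaged on the way and drops it.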

Electric signal escort

Repeater

Because electrical signals attenuate continuously during transmission, the repeater was created to keep attenuation from degrading communication quality. It only amplifies the signal so that it can travel farther.

Hub

Suppose both parties in a network link have only one network interface; then only one-to-one communication can be established. As we saw earlier, we also need one-to-many scenarios such as broadcasting, and broadcasting means copying the signal. This is the role of the hub, which can also reshape and amplify the electrical signal. It works at the physical layer.

By the way, unlike a switch, a hub has no memory or learning ability and no MAC address table. It sends data without a specific target; it is practically synonymous with broadcast transmission.

Bridge

The hub solves the one-to-many problem, but it brings problems of its own. In a real network several hubs may be connected together, and because they all communicate by broadcast, their traffic collides. We therefore need a way to isolate each subnet effectively, and this is the bridge. The name is vivid. It works at the data link layer, whereas the hub works at the physical layer, so it can confine broadcast traffic to one part of the network, with the parts connected to each other by the bridge.

Principles of Bridges

Now let us see how the bridge resolves broadcast conflicts. The bridge has only two ports, and the networks connected to them form two subnets, A and B. The bridge maintains a table for each subnet, empty at first. When it receives a data packet sent from subnet A or B, it unpacks the MAC header to obtain the source MAC address, records it in the corresponding table, and forwards the packet to the other subnet. After working for a while, it has recorded the MAC addresses of almost all machines in subnets A and B. From then on, when the bridge receives a data packet from subnet A, it still unpacks the MAC header and checks the receiver's MAC address. If that address is already recorded in table A, the packet does not need to go to subnet B; it can be handled within subnet A, and the bridge discards it. If the receiver's MAC address is not in table A, the packet is forwarded to subnet B. The bridge then checks the source MAC address and adds it to table A if it is not there yet. This solves the problem of broadcast conflicts between subnets caused by hubs reshaping and amplifying data packets.

In practice a bridge does not necessarily hold two separate tables; they may be merged into one, depending on the implementation.

switch

The bridge splits a local area network in two and solves the broadcast conflict problem, but the wheel of history keeps turning. Because a bridge has only two ports, when A and B are communicating, C and D cannot. It is like a small bridge with limited capacity that cannot let several people cross at once. To achieve many-to-many communication, the multi-port bridge was born: the switch.

Electric signal and switch

Back on track: the network card adds the preamble, the start frame delimiter and the FCS to the binary data according to the Ethernet protocol, and converts them into electrical signals for transmission.

After the electrical signal reaches the switch's port through the network cable, a module in the switch receives it and converts it into a digital signal. A digital signal takes discrete values within a given range rather than continuous ones; the opposite is an analog signal.

An important role of the switch is to make sure the data packet is forwarded to the destination unchanged. It checks the FCS at the end of the Ethernet frame for errors, and if the data is fine it puts the frame in the switch buffer. Up to this point the process is basically the same as in a network card, but from here it differs: a network card has its own MAC address in ROM, and a switch port does not. Instead, the switch maintains a MAC address table. The address table mainly contains two pieces of information:

  • Record the information of the receiver's MAC address

  • Record which port of the switch the receiver's device is connected to.

file

Careful students should have discovered that this part is very similar to the bridge, except that the bridge has only two ports.By disassembling the table to record, it is not necessary to record the port location information. If the current packet matches the MAC address recorded in the MAC table, it will be forwarded directly to the corresponding port. If the specified MAC address cannot be found, it is likely that the device behind this address has not sent packets to the switch, or the switch has deleted it from the address table because it has not been working continuously. At this time, it can only be sent to all ports like broadcast.As mentioned earlier, in the same Ethernet, it is sent to all devices in the entire network in the form of broadcast at the beginning of the design.Only the receiver will receive the packet.The device will ignore it. After the receiver returns a response, the switch will record its MAC address.

A packet with an unrecorded MAC address is forwarded to all ports except the source port. A destination address that is a broadcast address triggers the same behavior. Common broadcast addresses are:

  • FF:FF:FF:FF:FF:FF for MAC addresses

  • 255.255.255.255 for IP addresses
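The forwarding rules above can be put together as a small sketch of a learning switch. This is a simplified model, not a real switch implementation: the class name, port numbers, and MAC addresses are made up for illustration, and the aging of old table entries is left out.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy model of a switch with a MAC address table (no entry aging)."""

    def __init__(self, port_count: int) -> None:
        self.ports = list(range(port_count))
        self.mac_table: dict[str, int] = {}  # MAC address -> port number

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> list[int]:
        """Handle one frame and return the ports it is sent out of."""
        src_mac, dst_mac = src_mac.lower(), dst_mac.lower()
        # Learn: the sender is reachable through the port the frame arrived on.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        # Broadcast or unknown destination: flood to every port except the source.
        if dst_mac == BROADCAST_MAC or out_port is None:
            return [p for p in self.ports if p != in_port]
        # Destination sits on the port the frame came from: nothing to forward.
        if out_port == in_port:
            return []
        return [out_port]


sw = LearningSwitch(4)
first = sw.receive(0, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02")  # unknown, flooded
reply = sw.receive(2, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01")  # known, one port
again = sw.receive(0, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02")  # now known too
print(first, reply, again)
```

The first frame is flooded because the switch has not yet seen the receiver. Once the receiver replies, both addresses are in the table and later frames go to a single port.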

Gateway

As mentioned in the earlier discussion of the routing table, if the lookup does not end at the default gateway, the gateway may not be needed at all. So here we assume that the receiver's IP address was matched to the default gateway.

The default gateway is generally just another name for the router. Arriving at the router can be compared to reaching a checkpoint on a highway: the data packet is about to leave the subnet. In the following, we refer to the gateway as the router.

Routers are also called Layer 3 network devices. Each port of the router has a MAC address and an IP address, so it can act as a sender and receiver on the Ethernet. From this point of view, it is the same as a network card. Let's take a look at the workflow of the router:

  • The router receives the Ethernet frame and verifies the FCS. If there is no problem, it goes to the next step.

  • Next, it reads the MAC header to check whether the receiver's MAC address is its own. If not, it discards the packet.

  • If the packet is addressed to the router, the MAC header has done its job and is removed entirely. The router then reads the IP header to get the destination IP address.

  • It then queries its own routing table, which is the same operation as the IP layer's routing table lookup: first apply the subnet mask, then compare against the specific subnet IP.

  • If the gateway field of the matching entry is empty, the packet's IP address is itself the next destination, meaning the destination subnet has been reached.

  • If no subnet in the routing table matches, the destination has not been reached yet, and the router continues to forward the data packet to its own default gateway.
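The routing table lookup in the steps above can be sketched as follows. This is a simplified illustration under a few assumptions: the routes, gateway addresses, and port names are invented, the default gateway is modeled as a `0.0.0.0/0` entry, and when several entries match, the one with the longest prefix wins (longest-prefix matching, which is the common rule but is not spelled out in this article).

```python
import ipaddress
from typing import NamedTuple, Optional


class Route(NamedTuple):
    network: ipaddress.IPv4Network
    gateway: Optional[str]  # None means the subnet is directly connected
    port: str


# Hypothetical routing table for illustration only.
ROUTES = [
    Route(ipaddress.ip_network("192.168.1.0/24"), None, "eth0"),
    Route(ipaddress.ip_network("10.0.0.0/8"), "192.168.1.254", "eth0"),
    Route(ipaddress.ip_network("0.0.0.0/0"), "203.0.113.1", "eth1"),  # default gateway
]


def lookup(dst: str) -> Route:
    """Return the matching route with the longest prefix."""
    addr = ipaddress.ip_address(dst)
    candidates = [r for r in ROUTES if addr in r.network]
    # The 0.0.0.0/0 entry matches everything, so candidates is never empty here.
    return max(candidates, key=lambda r: r.network.prefixlen)


def next_hop(dst: str) -> str:
    """IP address whose MAC must be resolved via ARP for the next hop."""
    route = lookup(dst)
    # Empty gateway: the destination itself is the next hop.
    return dst if route.gateway is None else route.gateway


print(next_hop("192.168.1.20"))   # directly connected
print(next_hop("10.1.2.3"))       # via a specific gateway
print(next_hop("93.184.216.34"))  # falls through to the default gateway
```

The value returned by `next_hop` is what the router looks up through ARP before it builds the new MAC header.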

Recurse to the opposite end

The router uses ARP to obtain the MAC address for the default gateway IP it looked up, and it also has an ARP cache. Once it has the MAC address, it adds a new MAC header and Ethernet header to the data packet and forwards it through the port to the next gateway. Although the router reads the destination IP of the IP packet, it never changes the IP addresses of the sender and receiver.

After the packet is forwarded to the next gateway, these steps are repeated recursively, gateway to gateway, until the peer IP is reached.
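The point that the MAC header is rewritten at every hop while the IP addresses stay fixed can be shown with a short sketch. Everything here is hypothetical: the `Frame` type, the placeholder MAC names, the IP addresses, and the two-router path. The ARP lookup is replaced by a hard-coded list of hops.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    payload: bytes


def forward(frame: Frame, router_mac: str, next_hop_mac: str) -> Frame:
    """Drop the old MAC header and attach a new one for the next hop.

    The IP addresses and payload are copied over untouched.
    """
    return replace(frame, src_mac=router_mac, dst_mac=next_hop_mac)


# Hypothetical path: client -> router A -> router B -> server.
# Each pair is (MAC of the forwarding router, MAC of the next hop found via ARP).
HOPS = [("mac-router-a", "mac-router-b"), ("mac-router-b", "mac-server")]

frame = Frame("mac-client", "mac-router-a", "192.168.1.10", "93.184.216.34", b"GET / HTTP/1.1")
trace = [frame]
for router_mac, next_mac in HOPS:
    frame = forward(frame, router_mac, next_mac)
    trace.append(frame)

for f in trace:
    print(f.src_mac, "->", f.dst_mac, "|", f.src_ip, "->", f.dst_ip)
```

Each printed line has a different pair of MAC addresses, but the IP pair on the right is the same from the first hop to the last.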

After the packet reaches the receiving end, the Ethernet header, MAC header, IP header, and TCP header are removed in turn, and finally the HTTP message is read. test.com has now successfully received the GET request and sends the resource file back to us.

The server's HTTP process sees that the request is for a page, so it wraps the web page file in an HTTP response message.
The HTTP response message also needs TCP, IP, and MAC headers, but this time the source address is the server's IP address and the destination address is the client's IP address.
With all the headers in place, the message is sent out through the network card again. The switch forwards it to the gateway router, the router sends the response packet to the next router, and the process repeats recursively until the packet reaches the client's router. That router reads the IP header, finds that the packet really is for its own subnet, and sends it to the subnet's switch, which forwards it to the original sender.
After the sender's OS receives the server's response packet, it removes the various headers to get the HTTP response message and passes it to the Network Service via IPC.

After the Network Service receives the message, it checks the response status code. Remember our initial address, test.com? For it the server returns a 301 status code, which we can see with curl -I test.com:

HTTP/1.1 301 Moved Permanently
Server: nginx/1.18.0
Date: Tue, 07 Sep 2021 03:21:49 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
Keep-Alive: timeout=20
X-DIS-Request-ID: 241ea10b621b0644e9844c0f52ef76e1
Location: http://www.test.com/

At this point the Network Service automatically builds a new request whose target is the Location given in the response. The flow then returns to the "Build request" step of this article and runs through the whole process again. It is worth noting that modern browsers enable Connection: keep-alive by default, so an already established TCP connection can be reused to speed up the request. This continues until the Network Service receives a response again.
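The redirect decision can be sketched roughly as follows. This is not Chrome's Network Service code, only an illustration of the idea: the function names and the set of redirect codes handled are assumptions, and the sample response is a shortened copy of the 301 response for test.com.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def parse_head(raw: str):
    """Split a raw response head into (status_code, headers)."""
    lines = raw.strip().splitlines()
    status = int(lines[0].split()[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers


def next_url(current_url: str, raw_head: str):
    """Return the URL to request next, or None if no redirect is needed."""
    status, headers = parse_head(raw_head)
    if status in REDIRECT_CODES and "location" in headers:
        # urljoin also resolves a relative Location against the current URL.
        return urljoin(current_url, headers["location"])
    return None


RESPONSE = """HTTP/1.1 301 Moved Permanently
Server: nginx/1.18.0
Location: http://www.test.com/
"""

print(next_url("http://test.com/", RESPONSE))
```

When `next_url` returns a URL, a new request is built for it and the whole navigation flow starts over; when it returns `None`, the response is handed on for further processing.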

After receiving the response this time, the status code is normal, so the browser checks the Content-Type response header, whose value is a MIME type. If the content cannot be parsed by the browser, it starts an automatic download; if it is text/html, the flow officially enters the compilation chapter. You can see this with curl -i https://www.test.com/:

HTTP/1.1 200
Server: nginx/1.18.0
Date: Tue, 07 Sep 2021 03:39:19 GMT
Content-Type: text/html
Content-Length: 8859
Connection: keep-alive
Keep-Alive: timeout=20
ETag: "5e53086c-229b"
X-DIS-Request-ID: c821b8e4044843e8855e76558a610532
Set-Cookie: dis-request-id=c821b8e4044843e8855e76558a610532; secure
Set-Cookie: dis-timestamp=2021-09-06T20:39:19-07:00; secure
Set-Cookie: dis-remote-addr=61.175.192.50; secure
X-Frame-Options: sameorigin

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
...
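The render-or-download decision based on Content-Type can be sketched like this. It is a deliberately tiny model: the `RENDERABLE_TYPES` set is a made-up allow-list, and a real browser looks at more than this one header (for example Content-Disposition and content sniffing).

```python
# Hypothetical, deliberately small list of types the browser can display itself.
RENDERABLE_TYPES = {"text/html", "text/plain", "image/png"}


def media_type(content_type: str) -> str:
    """Drop parameters such as charset and normalize the case."""
    return content_type.split(";", 1)[0].strip().lower()


def decide(content_type: str) -> str:
    """Return 'render' or 'download' for a Content-Type header value."""
    return "render" if media_type(content_type) in RENDERABLE_TYPES else "download"


print(decide("text/html"))                 # value from the response above
print(decide("text/html; charset=UTF-8"))  # parameters are ignored
print(decide("application/zip"))           # not displayable, so it is downloaded
```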

Finally, when the client wants to leave, it initiates TCP's four-way close (the "four waves") with the server, after which the connection between the two sides is closed.
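The four segments of the close and the standard TCP state each side ends up in can be listed in a small sketch. It only walks through a table of steps and does not open a real connection. It assumes the client closes first, as in the text.

```python
# (sender, segment, client state afterwards, server state afterwards)
CLOSE_SEQUENCE = [
    ("client", "FIN", "FIN_WAIT_1", "CLOSE_WAIT"),
    ("server", "ACK", "FIN_WAIT_2", "CLOSE_WAIT"),
    ("server", "FIN", "TIME_WAIT", "LAST_ACK"),
    ("client", "ACK", "TIME_WAIT", "CLOSED"),
]


def close_connection():
    """Walk through the four segments and return the final states plus a log."""
    client, server = "ESTABLISHED", "ESTABLISHED"
    log = []
    for sender, segment, client, server in CLOSE_SEQUENCE:
        log.append(f"{sender} sends {segment}: client={client}, server={server}")
    return client, server, log


client_state, server_state, log = close_connection()
for entry in log:
    print(entry)
```

The client stays in TIME_WAIT for a while after the last ACK before it is fully closed, which is why its final state in this sketch is not CLOSED.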

That brings us to the end. Note that this walkthrough omitted the handshake, QUIC, and TLS verification that follow the first TCP arrival. They are all completed before the first HTTP response.
