Fragmented IP packet forwarding: IP fragmentation

  https://rtodto.net/fragmented-ip-packet-forwarding/

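Only the first fragment of an IP datagram carries the transport-layer (or ICMP) header; the remaining fragments carry only an IP header followed by data.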

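The data length of every fragment except the last is a multiple of 8 bytes, because the fragment offset is counted in 8-byte units.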

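Fragmentation raises two main questions. First, how to decide whether a packet needs to be fragmented: if the packet is longer than the MTU of the outgoing link (1500 bytes on Ethernet) and the flags in the IP header permit fragmentation (DF is not set), it must be fragmented. Second, what has to be done while fragmenting. If fragmentation is not permitted, the IP layer simply drops the packet and sends an ICMP error message back to the source.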
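The decision described above can be sketched as a small function. This is only an illustration: the name `forward_decision` and its return strings are made up for this example, and the MTU is passed in as a parameter rather than read from an interface.

```python
def forward_decision(total_len, mtu=1500, df=False):
    """Return what the IP layer does with a packet of total_len bytes.

    Hypothetical helper for illustration only, not a kernel or scapy API.
    """
    if total_len <= mtu:
        return "send"                      # fits, no fragmentation needed
    if df:
        # Too big and DF (Don't Fragment) is set: drop the packet and report
        # ICMP type 3 code 4 (fragmentation needed and DF set) to the source.
        return "drop+icmp"
    return "fragment"                      # too big, fragmentation permitted


print(forward_decision(1024))              # send
print(forward_decision(1501))              # fragment
print(forward_decision(1501, df=True))     # drop+icmp
```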

[root@bogon scapy]# cat frag.py 
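#!/usr/bin/python
# Send one UDP datagram as two IP fragments and show each fragment.

from scapy.all import IP, UDP, fragment, send

sip = "10.10.103.81"
dip = "10.10.103.229"
payload = "A" * 496 + "B" * 500          # 996 bytes of UDP data
packet = IP(src=sip, dst=dip, id=12345) / UDP(sport=1500, dport=1501) / payload

# fragsize=500 ends up as 504 bytes of IP payload per fragment
# (the next multiple of 8), as the capture below shows
frags = fragment(packet, fragsize=500)
for counter, frag in enumerate(frags, start=1):
    print("Packet no#" + str(counter))
    print("===================================================")
    frag.show()                          # display each fragment
    send(frag)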
[root@bogon scapy]# python frag.py
Packet no#1
===================================================
###[ IP ]### 
  version   = 4
  ihl       = None
  tos       = 0x0
  len       = None
  id        = 12345
  flags     = MF
  frag      = 0
  ttl       = 64
  proto     = udp
  chksum    = None
  src       = 10.10.103.81
  dst       = 10.10.103.229
  options   
###[ Raw ]### 
     load      = "\x05\xdc\x05\xdd\x03\xec\x1d'AAAA...AAAA"   (8 UDP header bytes followed by 496 'A' characters, shortened here)

.
Sent 1 packets.
Packet no#2
===================================================
###[ IP ]### 
  version   = 4
  ihl       = None
  tos       = 0x0
  len       = None
  id        = 12345
  flags     = 
  frag      = 63
  ttl       = 64
  proto     = udp
  chksum    = None
  src       = 10.10.103.81
  dst       = 10.10.103.229
  options   
###[ Raw ]### 
     load      = 'BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB'

.
Sent 1 packets.
[root@bogon scapy]# 
[root@bogon ~]# tcpdump -i enahisic2i3 udp  and  host 10.10.103.229 -env
tcpdump: listening on enahisic2i3, link-type EN10MB (Ethernet), capture size 262144 bytes
09:33:51.063862 48:57:02:64:ea:1e > Broadcast, ethertype IPv4 (0x0800), length 538: (tos 0x0, ttl 64, id 12345, offset 0, flags [+], proto UDP (17), length 524)
    10.10.103.81.vlsi-lm > 10.10.103.229.saiscm: UDP, bad length 996 > 496
09:33:53.323477 48:57:02:64:ea:1e > Broadcast, ethertype IPv4 (0x0800), length 534: (tos 0x0, ttl 64, id 12345, offset 504, flags [none], proto UDP (17), length 520)
    10.10.103.81 > 10.10.103.229: ip-proto-17
[root@bogon scapy]# netstat -i
Kernel Interface table
Iface             MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
br1              1450      647      0      0 0             9      0      0      0 BMRU
brq38c0d85e-bd   1500     2283      0      0 0             7      0      0      0 BMRU
brqf1411bad-10   1500   377062      0      0 0        410925      0      0      0 BMRU
enah2i3.1022     1500   130391      0      0 0           186      0      0      0 BMRU
enah2i3.1030     1500   187599      0      0 0           708      0      0      0 BMRU
enahisic2i0      1500 51252624      0 2558626 0       9121332      0      0      0 BMRU
enahisic2i1      1500 51019017      0 5601611 0          1453      0      0      0 BMRU
enahisic2i2      1500 28739907      0 211542 0           127      0      0      0 BMRU
enahisic2i3      1500 512773286      0 7104854 0       2347674      0      0      0 BMRU
enahisic2i3.222  1500      334      0      0 0           369      0      0      0 BMRU
enahisic2i3.310  1500        0      0      0 0             0      0      0      0 BMRU
enahisic2i3.900  1500 51252624      0 2558626 0       9121335      0      0      0 BMRU
lo              65536 984375886      0      0 0      984375886      0      0      0 LRU
tapae492383-36   1500      334      0      0 0           501      0      0      0 BMRU
tapb28a1d0d-a4   1500     4762      0      0 0        200041      0      0      0 BMRU
tapebc6bb55-29   1500        5      0      0 0             9      0      0      0 BMRU
veth1            1500      334      0      0 0           369      0      0      0 BMRU
virbr0           1500        0      0      0 0             0      0      0      0 BMU
vxlan100         1450      353      0      0 0           350      0      0      0 BMRU
[root@bogon scapy]# 

Now switch to a destination IP that actually exists on the network:

[root@bogon scapy]# ping 10.10.103.82
PING 10.10.103.82 (10.10.103.82) 56(84) bytes of data.
64 bytes from 10.10.103.82: icmp_seq=1 ttl=64 time=0.072 ms
64 bytes from 10.10.103.82: icmp_seq=2 ttl=64 time=0.067 ms
^C
--- 10.10.103.82 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1064ms
rtt min/avg/max/mdev = 0.067/0.069/0.072/0.008 ms
[root@bogon scapy]# cat frag.py 
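#!/usr/bin/python
# Send one UDP datagram as two IP fragments and show each fragment.

from scapy.all import IP, UDP, fragment, send

sip = "10.10.103.81"
dip = "10.10.103.82"
payload = "A" * 496 + "B" * 500          # 996 bytes of UDP data
packet = IP(src=sip, dst=dip, id=12345) / UDP(sport=1500, dport=1501) / payload

# fragsize=500 ends up as 504 bytes of IP payload per fragment
# (the next multiple of 8), as the capture below shows
frags = fragment(packet, fragsize=500)
for counter, frag in enumerate(frags, start=1):
    print("Packet no#" + str(counter))
    print("===================================================")
    frag.show()                          # display each fragment
    send(frag)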
[root@bogon scapy]# 

[root@bogon ~]# tcpdump -i enahisic2i3  host 10.10.103.82 -teennvv
tcpdump: listening on enahisic2i3, link-type EN10MB (Ethernet), capture size 262144 bytes
48:57:02:64:ea:1e > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.103.82 tell 10.10.103.81, length 28
48:57:02:64:e7:ae > 48:57:02:64:ea:1e, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Reply 10.10.103.82 is-at 48:57:02:64:e7:ae, length 46
48:57:02:64:ea:1e > 48:57:02:64:e7:ae, ethertype IPv4 (0x0800), length 538: (tos 0x0, ttl 64, id 12345, offset 0, flags [+], proto UDP (17), length 524)
    10.10.103.81.1500 > 10.10.103.82.1501: UDP, bad length 996 > 496
48:57:02:64:ea:1e > 48:57:02:64:e7:ae, ethertype IPv4 (0x0800), length 534: (tos 0x0, ttl 64, id 12345, offset 504, flags [none], proto UDP (17), length 520)
    10.10.103.81 > 10.10.103.82: ip-proto-17
48:57:02:64:e7:ae > 48:57:02:64:ea:1e, ethertype IPv4 (0x0800), length 590: (tos 0xc0, ttl 64, id 4394, offset 0, flags [none], proto ICMP (1), length 576)
    10.10.103.82 > 10.10.103.81: ICMP 10.10.103.82 udp port 1501 unreachable, length 556
        (tos 0x0, ttl 64, id 12345, offset 0, flags [none], proto UDP (17), length 1024)
    10.10.103.81.1500 > 10.10.103.82.1501: UDP, length 996

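      The receiver first reassembles the fragments and only then hands the datagram to layer 4, which finds that nothing is listening on UDP port 1501 and answers with an ICMP port unreachable. Note that the ICMP error quotes the reassembled datagram (length 1024, no MF flag), not an individual fragment.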
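The receiver's behaviour noted above (reassemble first, deliver to layer 4 afterwards) can be sketched roughly as follows. This is a simplified illustration, not how any particular kernel implements it: `reassemble` is a hypothetical helper, fragments are modelled as plain tuples, and the 8 UDP header bytes are copied from the scapy output earlier in this post.

```python
def reassemble(fragments):
    """Rebuild the IP payload from (byte_offset, more_fragments, data) tuples.

    Returns the payload as bytes, or None if fragments are still missing.
    Hypothetical helper; real stacks also key on src, dst, protocol and ID
    and apply a reassembly timeout.
    """
    payload = b""
    last_seen = False
    for offset, mf, data in sorted(fragments, key=lambda f: f[0]):
        if offset != len(payload):       # hole (or overlap): cannot deliver yet
            return None
        payload += data
        last_seen = not mf               # the final fragment has MF cleared
    return payload if last_seen else None


udp_header = bytes([0x05, 0xDC, 0x05, 0xDD, 0x03, 0xEC, 0x1D, 0x27])
frag1 = (0, True, udp_header + b"A" * 496)    # offset 0, MF set
frag2 = (504, False, b"B" * 500)              # offset 504, MF clear

print(reassemble([frag2]))                    # None: first fragment missing
print(len(reassemble([frag2, frag1])))        # 1004 = 8-byte UDP header + 996 data
```

Only once the full 1004 bytes are available can the UDP header at the front be parsed and the destination port looked up.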

Let me explain what happens in each line. (The sample lines below use different addresses from the captures above.)

1st Packet

IP (tos 0x0, ttl 64, id 12345, offset 0, flags [+], proto UDP (17), length 524)
    144.2.3.2.1500 > 173.63.1.2.1501: UDP, length 996

We handed scapy a payload of 496 'A' bytes followed by 500 'B' bytes. Scapy took the first 496 bytes (all 'A' characters) and encapsulated them with an 8-byte UDP header and a 20-byte IP header, 524 bytes in total. Pay attention to the port numbers: they are the UDP ports we set in the code. The UDP length shows 996 bytes because that is the size of the whole payload, not just the part carried in this fragment. The ID is 12345, the same on the 1st and 2nd packets. The offset is 0 because this is the first fragment. The More Fragments bit is also set, which tcpdump prints as flags [+].

2nd packet

IP (tos 0x0, ttl 64, id 12345, offset 504, flags [none], proto UDP (17), length 520)
    144.2.3.2 > 173.63.1.2: ip-proto-17

The real fun begins here. Where are the port numbers? The second packet has none, because the UDP header travels only in the first fragment. The packet size confirms this: 500 bytes of 'B' payload + 20 bytes of IP header = 520 bytes, which leaves no room for a UDP header. The evidence of fragmentation is the offset, but why is it 504? This field says how far the fragment's data lies from the start of the unfragmented IP payload, and that payload includes the UDP header, so the offset is 496 'A' bytes + 8 bytes of UDP header = 504. The header itself stores this value in 8-byte units, which is why scapy shows frag = 63 (63 * 8 = 504).
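The 504 shown by tcpdump is a byte offset; in the IPv4 header it shares a 16-bit word with the flag bits and is stored in 8-byte units. A minimal sketch of that encoding, using only the standard library (the function names are made up for this example):

```python
import struct

MF = 0x2000           # More Fragments bit within the 16-bit flags/offset word
DF = 0x4000           # Don't Fragment bit


def pack_flags_offset(byte_offset, mf=False, df=False):
    """Encode flags and fragment offset as the 2 bytes stored in the IPv4 header."""
    if byte_offset % 8:
        raise ValueError("fragment offset must be a multiple of 8 bytes")
    word = (byte_offset // 8) | (MF if mf else 0) | (DF if df else 0)
    return struct.pack("!H", word)


def unpack_flags_offset(raw):
    """Decode the 2-byte field back into (byte_offset, mf, df)."""
    (word,) = struct.unpack("!H", raw)
    return (word & 0x1FFF) * 8, bool(word & MF), bool(word & DF)


print(pack_flags_offset(0, mf=True).hex())     # 2000  (first fragment)
print(pack_flags_offset(504).hex())            # 003f  (second fragment, 0x3f = 63)
print(unpack_flags_offset(b"\x00\x3f"))        # (504, False, False)
```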

Note: If you display the same packets in Wireshark, the default setting “Reassemble fragmented IPv4 datagrams” can mislead you into thinking that the UDP header is in the second packet instead of the first one. Be careful!

The IP protocol: IP fragmentation

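When an IP datagram is larger than the MTU (maximum transmission unit) of the frame, it is sent in fragments. Fragmentation can happen at the sender or at an intermediate router, and a datagram may be fragmented more than once in transit. Only on the final destination machine are the fragments reassembled by the kernel's IP module.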

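  Three fields in the IPv4 header exist specifically to support fragmentation: the 16-bit Identification, the 3-bit Flags (including DF and MF), and the 13-bit Fragment Offset.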

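  Every fragment of an IP datagram has its own IP header. All fragments share the same identification value but have different fragment offsets, and every fragment except the last has the MF flag set. In addition, the total length field in each fragment's IP header is set to the length of that fragment.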

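  An Ethernet frame has an MTU of 1500 bytes, so an IP datagram can carry at most 1480 bytes of data (the IP header takes 20 bytes). To observe fragmentation, we use ICMP to send an IP datagram 1501 bytes long: 20 bytes of IP header and 1481 bytes of ICMP message. Of those 1481 bytes, 8 are the ICMP header and the remaining 1473 are data. This 1501-byte IP datagram is split into 2 IP fragments: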

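  (1) Fragment 1: 1480 bytes of the ICMP message (including the 8-byte ICMP header) + 20 bytes of IP header = a 1500-byte IP datagram, with the MF bit set
  (2) Fragment 2: 1 byte of the ICMP message (without the 8-byte ICMP header) + 20 bytes of IP header = a 21-byte IP datagram, with the MF bit not set. This fragment has no ICMP header.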

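  The datagram the user wants to send (1501 bytes):
    | IP header, 20 bytes | ICMP header, 8 bytes | ICMP data, 1473 bytes |

  After fragmentation:
    Fragment 1 (1500 bytes): | IP header, 20 bytes | ICMP header, 8 bytes | ICMP data, 1472 bytes |
    Fragment 2 (21 bytes):   | IP header, 20 bytes | ICMP data, 1 byte |
  Note: the MF (More Fragments) bit in the IP header of fragment 1 is set to 1; in fragment 2 it is not set, i.e. 0.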
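The split described above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: `plan_fragments` is a hypothetical helper that assumes a 20-byte IP header without options and works on lengths alone, not on real packets.

```python
def plan_fragments(payload_len, mtu=1500, ip_header_len=20):
    """Return a list of (byte_offset, length, more_fragments) for an IP payload."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    max_data = (mtu - ip_header_len) // 8 * 8
    plan = []
    offset = 0
    while offset < payload_len:
        length = min(max_data, payload_len - offset)
        more = offset + length < payload_len
        plan.append((offset, length, more))
        offset += length
    return plan


# 1481-byte ICMP message (8-byte header + 1473 data bytes) over a 1500-byte MTU
for offset, length, more in plan_fragments(1481):
    print(offset, length + 20, "MF" if more else "-")
# 0 1500 MF
# 1480 21 -
```

The printed columns are the fragment offset, the total length of the fragment including its IP header, and the MF flag.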

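  In fact, the length of the ICMP header depends on the type of ICMP message and varies considerably. 8 bytes is used here because the example that follows uses ping, and the ICMP echo request and echo reply messages used by ping both have 8-byte headers.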

2. Observing IP fragmentation with tcpdump

Machine 1: Ubuntu 14.04, IP address 192.168.239.136
Machine 2: Ubuntu 11.04, IP address 192.168.239.145

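  Ping machine 2 from machine 1, sending 1473 bytes of data each time (this is the ICMP data portion) to trigger IP fragmentation, and use tcpdump to capture the packets the two sides exchange: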

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
1. IP (tos 0x0, ttl 64, id 52457, offset 0, flags [+], proto ICMP (1), length 1500)
    192.168.239.136 > 192.168.239.145: ICMP echo request, id 4938, seq 1, length 1480
2. IP (tos 0x0, ttl 64, id 52457, offset 1480, flags [none], proto ICMP (1), length 21)
    192.168.239.136 > 192.168.239.145: ip-proto-1

3. IP (tos 0x0, ttl 64, id 36694, offset 0, flags [+], proto ICMP (1), length 1500)
    192.168.239.145 > 192.168.239.136: ICMP echo reply, id 4938, seq 1, length 1480
4. IP (tos 0x0, ttl 64, id 36694, offset 1480, flags [none], proto ICMP (1), length 21)
    192.168.239.145 > 192.168.239.136: ip-proto-1

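  Each ping message is carried as two IP fragments. Taking the echo request (lines 1 and 2) as an example:
  (1) Both fragments have the identification value 52457, which shows that they belong to the same IP datagram
  (2) The fragment offset of the first fragment is 0 and that of the second is 1480. The second fragment's offset is simply the length of the ICMP message carried in the first fragment
  (3) "flags [+]" on the first fragment means the MF flag is set, i.e. more fragments follow; "flags [none]" on the second fragment means the MF bit is not set
  (4) The two fragments are 1500 bytes and 21 bytes long respectively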
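The four observations above can also be checked mechanically. The following is a rough sketch that pulls the ID, offset, MF flag and length out of `tcpdump -v` style lines with a regular expression; the pattern and the helper name `group_fragments` are assumptions written for the output format shown in this post, not a general tcpdump parser.

```python
import re
from collections import defaultdict

LINE = re.compile(
    r"id (?P<id>\d+), offset (?P<offset>\d+), "
    r"flags \[(?P<flags>[^\]]*)\].*length (?P<length>\d+)\)"
)


def group_fragments(lines):
    """Group tcpdump -v IP header lines by IP ID.

    Returns {ip_id: [(offset, more_fragments, total_length), ...]}.
    """
    groups = defaultdict(list)
    for line in lines:
        m = LINE.search(line)
        if m:
            groups[int(m["id"])].append(
                (int(m["offset"]), "+" in m["flags"], int(m["length"]))
            )
    return dict(groups)


capture = [
    "IP (tos 0x0, ttl 64, id 52457, offset 0, flags [+], proto ICMP (1), length 1500)",
    "IP (tos 0x0, ttl 64, id 52457, offset 1480, flags [none], proto ICMP (1), length 21)",
]
print(group_fragments(capture))
# {52457: [(0, True, 1500), (1480, False, 21)]}
```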

Original article: https://www.cnblogs.com/dream397/p/13710601.html