Using cURL to download a single file in multiple simultaneous parts, with resume support (revised version)

Excerpted from http://bbs.chinaunix.net/thread-917952-1-1.html

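Tested on Ubuntu. Suitable for downloading files from sites that support multi-threaded (multi-connection) downloads.
It can be used from Firefox together with FlashGot.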

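Usage: ./mycurl url [referedUrl]
The first argument, url, is the address of the file to download. The second argument, referedUrl, is the referer URL (usually not needed; some sites, such as Huajun (华军), require it).
For example:
./mycurl ftp://xx.xxx.xxx/xxx.rar
or
./mycurl http://xx.xxx.xx/xxx.rar http://www.xxx.xxx/yy.htm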
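
To illustrate how the optional second argument ends up on the curl command line, here is a minimal sketch. `head_cmd` is a hypothetical helper that is not part of the script; it only prints, one argument per line, the header request the script would send (`-s -I`, plus `-L -e referer` when a referer is given) and never touches the network.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Prints the curl command line used to fetch the headers of $1,
# optionally sending $2 as the referer.
head_cmd() {
    local url="$1"
    local referer="$2"
    # Build the argument list in the positional parameters so that each
    # value stays a single argument even if it contains spaces.
    set -- -s -I "$url"
    if [ -n "$referer" ]; then
        # -L follows redirects, -e sets the Referer header
        set -- -L -e "$referer" "$@"
    fi
    printf '%s\n' curl "$@"
}

head_cmd "ftp://xx.xxx.xxx/xxx.rar"
head_cmd "http://xx.xxx.xx/xxx.rar" "http://www.xxx.xxx/yy.htm"
```

Keeping the options as separate arguments avoids the word-splitting problems of a single unquoted string, at the cost of a little more code.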

Here is the code:

#!/bin/bash
####################################################################
#
# Script for curl to support resumable multi-part download.
#
# Tested on Ubuntu
#

url=$1
if [ -z "$url" ]; then
      echo "Usage: $0 url [referedUrl]" >&2
      exit 1
fi

# How many "parts" will the target file be divided into?
declare -i parts=5

read -ep "Please input the target directory: " targetdir
read -ep "Please input the outfile name: " outfile

[ -z "$targetdir" ] && targetdir="./"
cd "$targetdir" || exit 2

[ -z "$outfile" ] && outfile=$(basename "$1")

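#Set the referer url. $refurl is expanded unquoted on purpose so that it
#splits into separate curl options; the referer must not contain spaces.
if [ -n "$2" ]; then
      refurl="-L -e $2"
else
      refurl=""
fi

#Header names are case-insensitive (HTTP/2 servers send them in lowercase).
#With -L there may be several responses, so take the last Content-Length.
length=$(curl $refurl -s -I "$url" | grep -i '^Content-Length' | tail -n 1 | sed 's/[^0-9]//g')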

if [ -z "$length" ]; then
      echo "Can't get the length of the target file" >&2
      exit 1
fi
let "length = $length"

#lsession is used to record how many bytes of each subpart should be downloaded
declare -i lsession=$(($length/$parts))

finished="false"

#Assume the available maximum connections on server can reach "parts" at first
maxconn=$parts

while true;
do

for (( i=1; i<=parts ; i=i+1 ))
do

      #Array offsetold is used to record how many bytes have been downloaded of each subpart

      if [ -e "$outfile$i" ]; then
            offsetold[$i]=$(wc -c < "$outfile$i")
      else
            offsetold[$i]=0
      fi
      let "offsetold[$i] = ${offsetold[$i]}"

done

curr=0

for (( i=1; i<=parts && maxconn>0; i=i+1 ))
do

      if [ $i -lt $parts ]; then
            if [ ${offsetold[$i]} -lt $lsession ]; then
                  curl $refurl -r $(($curr+${offsetold[$i]}))-$(($curr+$lsession-1)) "$url" >> "$outfile$i" &
                  maxconn=$(($maxconn-1))
            fi
      else
            #The last part also takes the remainder of the division, so it is requested up to the end of the file
            if [ ${offsetold[$i]} -lt $(($length-$lsession*($parts-1))) ]; then
                  curl $refurl -r $(($curr+${offsetold[$i]}))- "$url" >> "$outfile$i" &
                  maxconn=$(($maxconn-1))
            fi
      fi

      curr=$(($curr+$lsession))

done

#To wait for all curl processes to terminate.

wait

finished="true"
maxconn=0
for (( i=1; i<=parts; i=i+1 ))
do

      #Array offsetnew is used to record how many bytes have been downloaded of each subpart

      if [ -e "$outfile$i" ]; then
            offsetnew[$i]=$(wc -c < "$outfile$i")
      else
            offsetnew[$i]=0
      fi
      let "offsetnew[$i] = ${offsetnew[$i]}"
      if [ $i -lt $parts ]; then
            if [ ${offsetnew[$i]} -lt $lsession ]; then
                  finished="false"
            fi
      else
            if [ ${offsetnew[$i]} -lt $(($length-$lsession*($parts-1))) ]; then
                  finished="false"
            fi
      fi

      #Calculate the "real" available maximum connections supported by server:
      #only parts that grew during this round count as working connections

      if [ ${offsetnew[$i]} -gt ${offsetold[$i]} ]; then
            maxconn=$(($maxconn+1))
      fi
done

      if [ "$finished" = "true" ]; then
            break
      elif [ $maxconn -eq 0 ]; then
            echo "Some errors may have occurred. Retrying in 10 seconds..."
            sleep 10
            maxconn=$parts
      fi
done

echo "All parts have been downloaded. Merging..."

mv --backup=t "${outfile}1" "$outfile"
for (( i=2; i<=parts; i=i+1 ))
do
      cat "$outfile$i" >> "$outfile"
      rm "$outfile$i"
done

echo "Done."
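
The byte-range arithmetic in the script is easier to follow in isolation. The sketch below is not the author's code: `part_range` is a hypothetical helper that prints the range still missing for one part, given the file length, the number of parts, the part index, and the bytes already on disk. Unlike the script, which requests the last part with an open-ended range (`start-`), this sketch prints an explicit end byte; the two are equivalent when the length is known.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Usage: part_range LENGTH PARTS INDEX HAVE
# Prints "start-end" for the bytes of part INDEX (1-based) that are still
# missing, or nothing if the part is already complete.
part_range() {
    local length="$1" parts="$2" index="$3" have="$4"
    local size=$((length / parts))          # same as lsession in the script
    local start=$(( (index - 1) * size ))   # same as curr in the script
    local end
    if [ "$index" -lt "$parts" ]; then
        end=$((start + size - 1))
    else
        # The last part absorbs the remainder of the division.
        end=$((length - 1))
    fi
    if [ $((start + have)) -le "$end" ]; then
        echo "$((start + have))-$end"
    fi
    return 0
}

# Ranges for a 103-byte file split into 5 parts, nothing downloaded yet.
for n in 1 2 3 4 5; do
    part_range 103 5 "$n" 0
done
```

Because each part is appended to with `>>`, resuming only needs the current size of the part file as the fourth argument.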

[ Last edited by ypxing on 2007-4-4 21:45 ]
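
The script merges the parts without checking the result. As one possible safeguard, the following sketch concatenates the parts into a temporary file and only replaces the target when the merged size matches the expected length. `merge_parts` and the `.merging` suffix are assumptions made for this example, not something the original script does.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Usage: merge_parts OUTFILE PARTS EXPECTED_LENGTH
# Concatenates OUTFILE1..OUTFILEn and keeps the result only if its size
# equals EXPECTED_LENGTH. The part files are removed only on success.
merge_parts() {
    local out="$1" parts="$2" expected="$3"
    local tmp="$out.merging"
    local i=1 actual
    : > "$tmp" || return 1
    while [ "$i" -le "$parts" ]; do
        cat "$out$i" >> "$tmp" || { rm -f "$tmp"; return 1; }
        i=$((i + 1))
    done
    # Arithmetic expansion strips any padding that wc may print.
    actual=$(( $(wc -c < "$tmp") ))
    if [ "$actual" -ne "$expected" ]; then
        echo "merged size $actual differs from expected $expected" >&2
        rm -f "$tmp"
        return 1
    fi
    mv "$tmp" "$out" || return 1
    i=1
    while [ "$i" -le "$parts" ]; do
        rm -f "$out$i"
        i=$((i + 1))
    done
}

# Small offline demonstration with two fake parts.
demo_dir=$(mktemp -d)
printf 'abc' > "$demo_dir/file1"
printf 'de' > "$demo_dir/file2"
merge_parts "$demo_dir/file" 2 5 && cat "$demo_dir/file"
echo
rm -rf "$demo_dir"
```

In the script, the third argument would be the `$length` obtained from the Content-Length header.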


Reply #2

Posted on 2007-04-01 11:01:09

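Thumbs up, this script is really practical (your first post since 2004...).

Two suggestions:
1. Accept curl's own options and pass them through, so that the script becomes a complete wrapper around curl.
2. To wait until all the background processes started by this shell script have finished, use wait; it is both accurate and concise.
Also, as your shell skills improve, you can try reworking the script yourself; I believe you will gain something from that as well.

PS: There is a truly multi-threaded command-line download tool, axel (this script downloads with multiple processes). It is said to be good, but I have not used it; it may be worth a try.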
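
As a small illustration of the second suggestion: a bare `wait` in bash waits for every background job but always returns 0, so it cannot tell whether any of them failed. Waiting on each PID exposes the individual exit statuses. The sketch below uses a hypothetical `run_parallel` helper with placeholder commands instead of real curl processes.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Runs each argument as a background command, waits for every PID and
# prints how many of the commands exited with a non-zero status.
run_parallel() {
    local pids="" pid cmd failed=0
    for cmd in "$@"; do
        sh -c "$cmd" &
        pids="$pids $!"        # remember the PID of the job just started
    done
    for pid in $pids; do
        # wait PID returns the exit status of that particular job
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}

# Placeholder commands stand in for the per-part curl processes.
run_parallel "true" "exit 3" "true"
```

The script itself decides whether to retry by comparing part sizes before and after each round, so it does not strictly need the exit statuses; this is just one way to make failures visible.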
