URL Parsing

Reference post: 【基础进阶】URL详解与URL编码 (URL Explained and URL Encoding)

I. URI vs URL

URI: Uniform Resource Identifier.

URL: Uniform Resource Locator (sometimes expanded as Universal Resource Locator).

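Relationship:

- URI is the more general, more abstract concept: a standard syntax for strings that identify a resource.
- In other words, URI is the parent category and URL is a subclass of it. URLs are a subset of URIs.
- The difference is that a URI only has to identify a resource, while a URL also states how to access it, through its scheme (for example http://) and its location.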
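As a small illustration of the subset relation, the sketch below passes two identifiers to the standard `URL` constructor. The `urn:isbn:...` value and the helper name `hasLocation` are assumptions made for this example, not part of the referenced post. The first string only names a resource, while the second also says where to reach it.

```javascript
// Hypothetical identifiers chosen for illustration only.
var nameOnly = new URL('urn:isbn:0451450523');     // a URI that is only a name (a URN)
var locator  = new URL('http://www.cnblogs.com/'); // a URI that is also a URL

// Returns true when the identifier carries a network location (a host).
function hasLocation(u) {
    return u.host !== '';
}

console.log(nameOnly.protocol, hasLocation(nameOnly)); // urn: false
console.log(locator.protocol, hasLocation(locator));   // http: true
```

Checking for a host is only a rough heuristic for "has a location". It is used here because it is easy to observe from the parsed object.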

II. URL

A URL (Uniform Resource Locator) is an address that describes a resource on the network.
Basic format

protocol://hostname[:port]/pathname[;url-params][?search][#hash]
- protocol: the underlying protocol (scheme) used to access the resource

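protocol	Meaning	Used for
http	HyperText Transfer Protocol	Ordinary web pages starting with http://. Not encrypted.
https	HyperText Transfer Protocol Secure	Secure web pages. All exchanged data is encrypted.
ftp	File Transfer Protocol	Downloading files from or uploading files to a site.
file		Files on your own computer.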

(host = hostname[:port])

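- hostname: the domain name (or IP address) of the server
- port: the default port of an HTTP server is 80 and can be omitted. Any other port must be given explicitly, for example: http://www.cnblogs.com:8080/
- pathname: the path to the resource (including the file name)
- url-params: parameters attached to the path
- search: the query string, that is, the data sent to the HTTP server
- hash: the anchor (fragment)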
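The relation between host, hostname and port can be checked with the standard `URL` object. This is a minimal sketch that reuses the cnblogs address from the list above. The variable names are arbitrary.

```javascript
// An explicit non-default port shows up in both port and host.
var u = new URL('http://www.cnblogs.com:8080/');
console.log(u.hostname); // www.cnblogs.com
console.log(u.port);     // 8080
console.log(u.host);     // www.cnblogs.com:8080

// When the port equals the scheme's default (80 for http),
// the URL API reports port as an empty string and drops it from host.
var d = new URL('http://www.cnblogs.com:80/');
console.log(d.port === ''); // true
console.log(d.host);        // www.cnblogs.com
```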

Example:

http://www.mywebsite.com/sj/test;id=8079?name=sviergn&x=true#stuff

Scheme: http

host.domain: www.mywebsite.com

path: /sj/test

URL params: id=8079

Query String: name=sviergn&x=true

Anchor: stuff
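The breakdown above can be reproduced with the `URL` object, with one caveat: as far as the standard URL API is concerned, the `;id=8079` part is simply part of `pathname`, so the sketch below splits it off by hand. The variable names `path` and `urlParams` are chosen for this example only.

```javascript
var u = new URL('http://www.mywebsite.com/sj/test;id=8079?name=sviergn&x=true#stuff');

// The URL API keeps ';id=8079' inside pathname, so separate it manually.
var semi = u.pathname.indexOf(';');
var path = semi === -1 ? u.pathname : u.pathname.slice(0, semi);
var urlParams = semi === -1 ? '' : u.pathname.slice(semi + 1);

console.log(u.protocol.replace(':', ''));  // http
console.log(u.hostname);                   // www.mywebsite.com
console.log(path);                         // /sj/test
console.log(urlParams);                    // id=8079
console.log(u.search.replace(/^\?/, ''));  // name=sviergn&x=true
console.log(u.hash.replace(/^#/, ''));     // stuff
```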

III. Parsing a URL

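Goal: extract the needed parts of a URL, such as the host and the request parameters.
Usual approach: match each field with regular expressions.
Neat trick: dynamically create an HTMLAnchorElement (an a tag) or a URL object and read its properties (see the URL-related Web APIs), with a little regex on top (see regular expressions).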

function parseURL(url) {
    // Approach 1: dynamically create an a tag (an HTMLAnchorElement)
    var a = document.createElement('a');
    a.href = url;
    // Approach 2: create a URL object instead: var a = new URL(url);

    return {
        source: url,
        protocol: a.protocol.replace(':',''),
        hostname: a.hostname,
        port: a.port,
        pathname: a.pathname,
        segments: a.pathname.replace(/^\//,'').split('/'), // strip the leading / from pathname, then split the rest on /
        file: (a.pathname.match(/([^\/?#]+)$/i) || [,''])[1], // the trailing run of characters other than / ? # is the file name; otherwise an empty string
        search: a.search,
        params: (function(){
            var ret = {};
            var seg = a.search.replace(/^\?/,'').split('&');
            var len = seg.length;
            for (var i = 0;i<len;i++) {
                if (!seg[i]) { continue; }
                var s = seg[i].split('=');
                ret[s[0]] = s[1];
            }
            return ret;
        })(),
        hash: a.hash.replace('#','')
    };
}

Usage:

var myURL = parseURL('http://abc.com:8080/dir/index.html?id=255&m=hello#top');

myURL.source; // 'http://abc.com:8080/dir/index.html?id=255&m=hello#top'

myURL.protocol; // 'http'

myURL.hostname; // 'abc.com'

myURL.port; // '8080'

myURL.pathname; // '/dir/index.html'

myURL.segments; // Array = ['dir', 'index.html']

myURL.file; // 'index.html'

myURL.search; // '?id=255&m=hello'

myURL.params; // Object = { id: '255', m: 'hello' }

myURL.hash; // 'top'
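The `params` logic in parseURL leaves values percent-encoded and lets a repeated key overwrite the earlier value. As a hedged alternative, the sketch below builds the same kind of object with the standard `URLSearchParams` interface, which decodes values and exposes every occurrence of a key. `parseQuery` is a hypothetical helper name, and the `m=hello%20world` and `tag` parameters are made up to show the difference.

```javascript
// parseQuery is a hypothetical helper, not part of the original parseURL.
function parseQuery(url) {
    var params = new URL(url).searchParams; // a URLSearchParams object
    var ret = {};
    params.forEach(function (value, key) {
        if (Object.prototype.hasOwnProperty.call(ret, key)) {
            // Repeated key: collect all values into an array.
            ret[key] = [].concat(ret[key], value);
        } else {
            ret[key] = value;
        }
    });
    return ret;
}

var q = parseQuery('http://abc.com:8080/dir/index.html?id=255&m=hello%20world&tag=a&tag=b#top');
console.log(q.id);  // 255
console.log(q.m);   // hello world
console.log(q.tag); // [ 'a', 'b' ]
```

All values are still strings, so `q.id` is `'255'` rather than the number 255. Any numeric conversion is left to the caller.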