doubleclick cookie、动态脚本、用户画像、用户行为分析和海量数据存取 推荐词 京东 电商 信息上传 黑洞 https://blackhole.m.jd.com/getinfo

doubleclick cookie 

https://mp.weixin.qq.com/s/vZUj-Z9FGSSWXOodGqbYkA

揭密Google的网络广告技术:基于互联网大数据视角

 相信每个人在上网时都被各种网络广告所困扰,不断地消耗着我们的流量。如果稍微细心观察,或许会发现不同网站推送过来的广告也比较适合自己的偏好,看来其中的技术手段并非简单之事。涉及到互联网大数据技术包括:cookie、动态脚本、用户画像、用户行为分析和海量数据存取等。

      假如你在京东上点击笔记本电脑,过几天以后当你浏览一个从未访问过的网站时,你很可能发现页面上竟然有笔记本的广告。

图1 

       作为一个互联网大数据技术研究者,本能反应当然是看看页面的源代码,确实可以找到相应的脚本,其中的“-ad-”大概表明了这里嵌入了广告。

 

图2

        但由于是动态脚本,无法看出广告具体在哪个网站上。为此,可以通过浏览器的设置功能,进入开发者模式(Source),找到广告条对应的脚本结构。

图3 

        然后查看这段动态脚本执行完成后对应的URL,从下图可以看出这个广告URL指向了googleads.g.doubleclick.net,从域名看就是google的广告。

图4

        没错,doubleclick是一家互联网广告公司,在2008年被Google收购。它提供了多种广告管理和广告投放解决方案,帮助企业购买、制作或销售在线广告,允许用户对网络广告活动进行集中策划、执行、监控和追踪。由此我们可以画出Google的网络广告技术平台架构图。

 

图5

整个流程按图中标注的序号1-5。

1 需要做广告的客户到doubleclick上进行注册、登记;

2 加入广告联盟的网站从doubleclick获得嵌入广告的动态脚本,即类似于图2所示。并将这些代码嵌入到页面中;

3 互联网用户大众通过浏览器访问页面,动态脚本在用户浏览器上执行,获得指向doubleclick的URL;

4 连接doubleclick时,doubleclick生成用户的唯一标识,并写入到本地cookie文件;

5 以后我们每次访问含有广告脚本的页面时,自动读取doubleclick的cookie,并由doubleclick抽取合适的广告。这样每个人的唯一身份就记录到它的数据库中了。而这个步骤,显然是基于我们点击广告、浏览页面的行为数据,是一个海量数据。精准的广告推送需要进行大数据挖掘、用户画像。

       在这个流程中,cookie起到了很大作用,在每台电脑上几乎都有doubleclick的cookie文件。对于win7下的IE,一般是在C:UsersAdministratorAppDataLocalMicrosoftWindowsTemporary Internet Files中;Chrome浏览器可以Chrome设置->隐私设置->内容设置。找到后可以清除。

    1. Request URL:
      https://blackhole.m.jd.com/getinfo
    2. Request Method:
      POST
    3. Status Code:
      200 OK
    4. Remote Address:
      124.200.54.26:443
    5. Referrer Policy:
      no-referrer-when-downgrade
  1. Response Headers
    1. Access-Control-Allow-Origin:
      *
    2. Connection:
      keep-alive
    3. Content-Length:
      95
    4. Content-Type:
      text/plain
    5. Date:
      Sun, 28 Apr 2019 02:17:39 GMT
    6. Server:
      jfe
  2. Request Headers
    1. Provisional headers are shown
    2. Content-Type:
      application/x-www-form-urlencoded
    3. Origin:
      https://www.jd.com
    4. Referer:
      https://www.jd.com/
    5. User-Agent:
      Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36
  3. Form Dataview sourceview URL encoded
    1. body:
      {"appname":"jdwebm_hf","jdkey":"","whwswswws":"","businness":"pcHome","body":{"browser_info":"5a6de6eb4239d72a591cd732fcf557bc","client_time":1556417862694,"period":24,"shshshfpa":"93048091-c96c-4aff-40ec-0c8bb237d983-1556417862","whwswswws":"","cookie_pin":"","jdu":"1556417860906768591982","mba_muid":"","visitkey":"","msdk_version":"2.3.4","wid":"","language":"en-US","color_depth":24,"pixel_ratio":1,"resolution":"1280;800","available_resolution":"1227;800","session_storage":1,"local_storage":1,"indexed_db":1,"open_database":1,"cpu_class":"unknown","navigator_platform":"Win32","regular_plugins":"Chrome PDF Plugin::Portable Document Format::application/x-google-chrome-pdf~pdf;Chrome PDF Viewer::::application/pdf~pdf;Native Client::::application/x-nacl~,application/x-pnacl~","adblock":false,"touch_support":0,"app_code_name":"Mozilla","app_name":"Netscape","app_version":"5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36","cookie_enabled":true,"regular_mimetypes":"::::;::Portable Document Format::;::Native Client Executable::;::Portable Native Client Executable::","online":"unknown","hardwareConcurrency":2,"product":"Gecko","productSub":"20030107","vendor":"Google Inc.","vendorSub":"unknown","devicePixelRatio":1,"updateInterval":"unknown","orientationType":"landscape-primary","doNotTrack":0,"canvas":"canvas winding:yes~canvas fp:231bea7a22d38c7771b1fd991affdfc6","webgl":"fp:cba0abf4b20cd68cd9fb42b1524d0708~extensions:ANGLE_instanced_arrays;EXT_blend_minmax;EXT_color_buffer_half_float;EXT_frag_depth;EXT_shader_texture_lod;EXT_texture_filter_anisotropic;WEBKIT_EXT_texture_filter_anisotropic;OES_element_index_uint;OES_standard_derivatives;OES_texture_float;OES_texture_half_float;OES_texture_half_float_linear;OES_vertex_array_object;WEBGL_color_buffer_float;WEBGL_compressed_texture_s3tc;WEBKIT_WEBGL_compressed_texture_s3tc;WEBGL_debug_renderer_info;WEBGL_debug_shaders;WEBGL_depth_texture;WEBKIT_WEBGL_depth_texture;WEBGL_lose_context;WEBKIT_WEBGL_lose_context~aliased line width range:[1, 1]~aliased point size range:[1, 256]~alpha bits:8~antialiasing:yes~blue bits:8~depth bits:24~green bits:8~max anisotropy:16~max combined texture image units:20~max cube map texture size:4096~max fragment uniform vectors:221~max render buffer size:4096~max texture image units:16~max texture size:4096~max varying vectors:9~max vertex attribs:16~max vertex texture image units:4~max vertex uniform vectors:253~max viewport dims:[4096, 4096]~red bits:8~renderer:WebKit WebGL~shading language version:WebGL GLSL ES 1.0 (OpenGL ES GLSL ES 1.0 Chromium)~stencil bits:0~vendor:WebKit~version:WebGL 1.0 (OpenGL ES 2.0 Chromium)~unmasked vendor:Google Inc.~unmasked renderer:ANGLE (Mobile Intel(R) 4 Series Express Chipset Family Direct3D9Ex vs_3_0 ps_3_0)~vertex high float:23(127,127)~vertex medium float:23(127,127)~vertex low float:23(127,127)~fragment high float:23(127,127)~fragment medium float:23(127,127)~fragment low float:23(127,127)~vertex high int:0(24,24)~vertex medium int:0(24,24)~vertex low int:0(24,24)~fragment high int:0(24,24)~fragment medium int:0(24,24)~fragment low int:0(24,24)","device_memory":8,"is_headless_browser":0}}

 在保存有uuid情况下

Request URL:https://floor.jd.com/user/hotwords/get?pin=&uuid=1550390246822668439123&callback=jsonpHotWords
Request Method:GET
Status Code:200 OK
Remote Address:211.144.24.170:443
Referrer Policy:no-referrer-when-downgrade
Response Headers
view source
Connection:close
Content-Encoding:gzip
Content-Type:text/html; charset=utf-8
Date:Sun, 28 Apr 2019 02:31:52 GMT
Server:jfe
Transfer-Encoding:chunked
Vary:Accept-Encoding
Request Headers
view source
Accept:*/*
Accept-Encoding:gzip, deflate, sdch, br
Accept-Language:zh-CN,zh;q=0.8
Connection:keep-alive
Cookie:shshshfpa=e27a5d69-e1c0-282d-ea23-9695b1e69510-1550390256; TrackID=1SQI5uj2G1r220dR6ifodPmI8KRO5dZs3OmsiX1SfPCYPCDefRrnEfXWxtXJXAoVZxMHISD56FXkht7-BTmb0iK9S9AT1-UppuX4Q7Pf0u1M; pinId=aR72BDnyHaxs0LHNbV6fLg; __jdv=122270672|direct|-|none|-|1556382041032; areaId=11; ipLoc-djd=11-799-0; PCSYCityID=1137; __jda=122270672.1550390246822668439123.1550390247.1556382041.1556417842.9; __jdb=122270672.3.1550390246822668439123|9.1556417842; __jdc=122270672; shshshfp=8e7ec1c7e67d1ed4944e451a3574168a; shshshsID=6c6284d43d4431f049e3a6f3152e5d03_3_1556418662038; shshshfpb=z8d2uPw45jLHaI6jyaglJIw%3D%3D; __jdu=1550390246822668439123
Host:floor.jd.com
Referer:https://www.jd.com/
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0
Query String Parameters
view source
view URL encoded
pin:
uuid:1550390246822668439123
callback:jsonpHotWords

 
 
 
 
原文地址:https://www.cnblogs.com/rsapaper/p/10782334.html