49. word2vec

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import word2vec_basic
Found and verified text8.zip
Data size 17005207
Most common words (+UNK) [['UNK', 418391], ('the', 1061396), ('of', 593677), ('and', 416629), ('one', 411764)]
Sample data [5243, 3081, 12, 6, 195, 2, 3135, 46, 59, 156] ['anarchism', 'originated', 'as', 'a', 'term', 'of', 'abuse', 'first', 'used', 'against']
3081 originated -> 5243 anarchism
3081 originated -> 12 as
12 as -> 3081 originated
12 as -> 6 a
6 a -> 12 as
6 a -> 195 term
195 term -> 2 of
195 term -> 6 a
Initialized
Average loss at step 0 : 275.96685791
Nearest to b: lim, mathbb, pron, sadd, postmodernism, yearning, interim, circumstance,
Nearest to s: astronomers, hallelujah, ona, heiress, sparkling, proverb, rulings, bartle,
Nearest to when: superpower, gaels, cutaway, novarum, ananda, geostationary, panthera, hypocrisy,
Nearest to seven: herbivorous, hyperplasia, kenyatta, ajanta, zadok, eternally, fairness, hine,
Nearest to of: mumbai, guidebook, arlington, phase, slowdown, palomar, hardcover, phonetics,
Nearest to system: getty, instructed, archers, beowulf, empowerment, arrears, grandsons, nicea,
Nearest to its: federico, bins, transducers, stanhope, range, freight, menai, vaduz,
Nearest to called: massed, desertification, doesn, morphology, monasteries, canceled, watering, lumpur,
Nearest to known: bolivia, banzer, humanism, adele, finnic, kwajalein, filtration, putting,
Nearest to will: horrors, fr, analysis, moravians, landslide, parenting, isomer, insulated,
Nearest to people: pathological, anagram, jonas, scenario, intercepts, guru, prequels, kirchhoff,
Nearest to nine: philosophy, dukes, trusting, szabo, contradicting, columba, citation, forks,
Nearest to also: jean, positive, articulated, serious, shepard, rabin, science, supplement,
Nearest to eight: response, amour, hissarlik, badminton, tuscany, heightened, ils, ashamed,
Nearest to are: jovian, provider, supervision, bosom, henslow, gimmicks, acute, burundi,
Nearest to all: robertson, mammoths, shapeshifting, mobilize, wasteful, nearing, kansas, resentment,
Average loss at step 2000 : 113.10768539
Average loss at step 4000 : 52.4376370575
Average loss at step 6000 : 33.8352457421
Average loss at step 8000 : 23.7887491972
Average loss at step 10000 : 18.0617327156
Nearest to b: gland, indians, lim, taliban, tezuka, circumstance, nine, rendering,
Nearest to s: and, condom, the, aarhus, UNK, holmes, of, gland,
Nearest to when: deposits, gland, geostationary, experimental, allowing, were, algebra, hypocrisy,
Nearest to seven: eight, analogue, zero, nine, six, reginae, gland, phi,
Nearest to of: and, in, for, from, with, the, agave, roper,
Nearest to system: psi, instructed, law, empowerment, saskatchewan, archaeology, celebrated, obligation,
Nearest to its: the, agave, their, range, a, kicking, aarhus, established,
Nearest to called: sadler, victoriae, mathbf, experimented, UNK, anthony, monasteries, doesn,
Nearest to known: bob, bolivia, phi, seer, music, helped, convention, humanism,
Nearest to will: fr, horrors, analysis, rfc, skill, situated, vogt, mya,
Nearest to people: reginae, perceived, music, jonas, september, married, pathological, scenario,
Nearest to nine: gland, zero, reginae, eight, gb, victoriae, cl, altenberg,
Nearest to also: jean, zionist, reginae, serious, crispin, probe, supplement, confusing,
Nearest to eight: six, nine, gland, zero, five, seven, reginae, phi,
Nearest to are: is, ba, kramnik, hoax, were, african, analogue, supervision,
Nearest to all: kansas, expanded, asterism, profession, complexity, references, robertson, represents,
Average loss at step 12000 : 13.7994806267
Average loss at step 14000 : 11.7659612741
Average loss at step 16000 : 9.8469510901
Average loss at step 18000 : 8.50730247939
Average loss at step 20000 : 7.85234803987
Nearest to b: lim, and, gland, circumstance, indians, tezuka, nine, pron,
Nearest to s: and, zero, holmes, the, or, birkenau, his, of,
Nearest to when: deposits, were, geostationary, and, gland, experimental, analogue, ananda,
Nearest to seven: eight, nine, zero, five, six, three, two, four,
Nearest to of: in, and, for, with, from, nine, eight, agave,
Nearest to system: psi, instructed, law, archers, UNK, cartier, nicea, empowerment,
Nearest to its: the, their, his, agave, a, absalom, aarhus, range,
Nearest to called: sadler, massed, UNK, monasteries, victoriae, experimented, pair, mathbf,
Nearest to known: dasyprocta, bob, bolivia, injuring, arg, phi, bug, hmong,
Nearest to will: fr, would, rfc, horrors, bosniaks, analysis, emerson, situated,
Nearest to people: reginae, perceived, scenario, odes, music, intercepts, anagram, pathological,
Nearest to nine: eight, six, seven, five, zero, four, dasyprocta, three,
Nearest to also: jean, zionist, which, crispin, amber, reginae, cth, confusing,
Nearest to eight: nine, five, six, zero, seven, three, four, two,
Nearest to are: is, were, was, kramnik, analogue, hoax, in, mathbf,
Nearest to all: expanded, kansas, rhenish, asterism, robertson, complexity, profession, represents,
Average loss at step 22000 : 7.24495614147
Average loss at step 24000 : 7.01978718054
Average loss at step 26000 : 6.66928812242
Average loss at step 28000 : 6.14945300984
Average loss at step 30000 : 6.17055390692
Nearest to b: and, gland, circumstance, lim, d, grants, landscapes, indians,
Nearest to s: and, zero, of, his, the, or, inches, six,
Nearest to when: deposits, speedup, and, geostationary, analogue, gland, experimental, were,
Nearest to seven: nine, eight, five, six, four, three, zero, two,
Nearest to of: in, and, for, from, s, nine, eight, iota,
Nearest to system: psi, empowerment, instructed, archers, cartier, law, nicea, obligation,
Nearest to its: their, the, his, a, agave, absalom, surroundings, amdahl,
Nearest to called: UNK, massed, sadler, primigenius, abitibi, victoriae, experimented, bagapsh,
Nearest to known: dasyprocta, adele, well, bob, used, seer, bolivia, injuring,
Nearest to will: would, fr, could, rfc, emerson, cpa, bosniaks, foam,
Nearest to people: reginae, odes, pathological, music, intercepts, scenario, perceived, guru,
Nearest to nine: eight, six, seven, five, four, three, zero, dasyprocta,
Nearest to also: which, zionist, crispin, sometimes, jean, cth, trinomial, reginae,
Nearest to eight: nine, six, five, seven, four, three, zero, abitibi,
Nearest to are: were, is, analogue, was, have, hoax, kramnik, anoa,
Nearest to all: rhenish, asterism, reuptake, kansas, expanded, dasyprocta, represents, profession,
Average loss at step 32000 : 5.86945372009
Average loss at step 34000 : 5.86404296362
Average loss at step 36000 : 5.67395866251
Average loss at step 38000 : 5.25235128129
Average loss at step 40000 : 5.48230646706
Nearest to b: UNK, circumstance, gland, grants, and, pron, d, landscapes,
Nearest to s: and, his, two, inches, holmes, the, or, birkenau,
Nearest to when: and, four, but, fielder, speedup, geostationary, deposits, were,
Nearest to seven: eight, six, five, four, nine, three, zero, one,
Nearest to of: in, from, for, and, abet, msg, eight, iota,
Nearest to system: psi, empowerment, instructed, cartier, archers, law, conflict, improved,
Nearest to its: their, the, his, a, agave, absalom, her, amdahl,
Nearest to called: UNK, massed, sadler, primigenius, abitibi, christiansen, victoriae, abet,
Nearest to known: used, adele, well, finnic, seer, dasyprocta, bolivia, bob,
Nearest to will: would, could, can, bosniaks, fr, may, rfc, to,
Nearest to people: reginae, odes, pathological, intercepts, music, coquitlam, scenario, perceived,
Nearest to nine: eight, seven, six, zero, five, four, three, dasyprocta,
Nearest to also: which, zionist, sometimes, crispin, generally, trinomial, reginae, cth,
Nearest to eight: nine, six, seven, five, four, zero, three, abitibi,
Nearest to are: were, is, have, was, analogue, absalon, angiotensin, kramnik,
Nearest to all: rhenish, asterism, reuptake, kansas, dasyprocta, many, expanded, any,
Average loss at step 42000 : 5.29408154821
Average loss at step 44000 : 5.32328894198
Average loss at step 46000 : 5.2740817008
Average loss at step 48000 : 5.040927809
Average loss at step 50000 : 5.12989223862
Nearest to b: gland, grants, circumstance, pron, six, d, abitibi, seven,
Nearest to s: zero, inches, his, and, nguni, pottery, recombine, vicarage,
Nearest to when: but, six, four, seven, speedup, deposits, gland, if,
Nearest to seven: eight, six, four, five, nine, three, zero, two,
Nearest to of: in, nine, for, and, from, thibetanus, reuptake, seven,
Nearest to system: psi, empowerment, instructed, cartier, archers, law, improved, conflict,
Nearest to its: their, the, his, agave, a, absalom, her, amdahl,
Nearest to called: massed, sadler, UNK, primigenius, naaman, abitibi, abet, adaptive,
Nearest to known: used, well, adele, seer, finnic, dasyprocta, epoxy, hmong,
Nearest to will: would, could, can, may, bosniaks, should, cpa, moravians,
Nearest to people: reginae, odes, coquitlam, music, pathological, intercepts, scenario, guru,
Nearest to nine: eight, seven, six, zero, four, five, three, dasyprocta,
Nearest to also: which, sometimes, zionist, thibetanus, generally, crispin, often, trinomial,
Nearest to eight: six, seven, nine, four, five, three, zero, dasyprocta,
Nearest to are: were, is, have, was, be, analogue, thibetanus, angiotensin,
Nearest to all: asterism, reuptake, two, dasyprocta, thibetanus, rhenish, many, expanded,
Average loss at step 52000 : 5.16474540925
Average loss at step 54000 : 5.10961878431
Average loss at step 56000 : 5.06780198526
Average loss at step 58000 : 5.11088050807
Average loss at step 60000 : 4.94124779272
Nearest to b: gland, microcebus, grants, d, circumstance, pron, abitibi, zero,
Nearest to s: his, zero, inches, and, michelob, recombine, vicarage, pottery,
Nearest to when: michelob, if, but, and, six, in, where, geostationary,
Nearest to seven: eight, six, five, four, nine, three, zero, two,
Nearest to of: for, in, microcebus, tamarin, thibetanus, and, abet, nine,
Nearest to system: empowerment, law, instructed, archers, microsite, tamarin, cartier, improved,
Nearest to its: their, the, his, tamarin, agave, her, absalom, ssbn,
Nearest to called: massed, sadler, tamarin, primigenius, michelob, naaman, abitibi, callithrix,
Nearest to known: used, well, adele, epoxy, finnic, microcebus, seer, hmong,
Nearest to will: would, could, can, may, should, to, moravians, bosniaks,
Nearest to people: reginae, odes, coquitlam, music, intercepts, pathological, cebus, saguinus,
Nearest to nine: eight, six, seven, five, four, zero, three, dasyprocta,
Nearest to also: which, sometimes, thibetanus, zionist, often, generally, tamarin, callithrix,
Nearest to eight: six, nine, seven, five, four, three, zero, two,
Nearest to are: were, is, have, angiotensin, be, kramnik, thibetanus, cebus,
Nearest to all: many, asterism, these, reuptake, two, thibetanus, dasyprocta, rhenish,
Average loss at step 62000 : 4.79670777971
Average loss at step 64000 : 4.79270891201
Average loss at step 66000 : 4.99029351902
Average loss at step 68000 : 4.88411666608
Average loss at step 70000 : 4.75195898664
Nearest to b: gland, grants, UNK, pron, d, seven, circumstance, microcebus,
Nearest to s: and, mitral, zero, inches, his, vicarage, michelob, holmes,
Nearest to when: if, michelob, but, before, where, was, during, six,
Nearest to seven: six, eight, five, four, nine, three, zero, one,
Nearest to of: for, in, microcebus, tamarin, same, iota, tabula, thibetanus,
Nearest to system: empowerment, law, improved, dinar, instructed, thaler, archers, conflict,
Nearest to its: their, his, the, tamarin, her, agave, ssbn, thaler,
Nearest to called: massed, UNK, tamarin, sadler, primigenius, michelob, naaman, mitral,
Nearest to known: used, well, epoxy, adele, such, microcebus, finnic, bug,
Nearest to will: would, could, can, may, should, must, moravians, to,
Nearest to people: reginae, odes, pathological, intercepts, coquitlam, cebus, members, saguinus,
Nearest to nine: eight, six, seven, five, four, zero, three, mitral,
Nearest to also: which, often, sometimes, zionist, thibetanus, generally, that, tamarin,
Nearest to eight: six, seven, nine, five, four, three, zero, michelob,
Nearest to are: were, is, have, be, thibetanus, while, angiotensin, was,
Nearest to all: many, these, some, asterism, reuptake, thibetanus, any, rhenish,
Average loss at step 72000 : 4.80778124154
Average loss at step 74000 : 4.75792721456
Average loss at step 76000 : 4.86112686592
Average loss at step 78000 : 4.79120120609
Average loss at step 80000 : 4.82245359421
Nearest to b: UNK, gland, d, seven, grants, microcebus, pron, david,
Nearest to s: zero, mitral, and, his, michelob, prohibition, inches, tamarin,
Nearest to when: if, michelob, but, before, pontificia, during, where, after,
Nearest to seven: six, eight, five, four, three, nine, zero, two,
Nearest to of: in, iota, tamarin, nine, microcebus, mitral, thibetanus, and,
Nearest to system: empowerment, improved, thaler, conflict, tamarin, instructed, dinar, microsite,
Nearest to its: their, his, the, tamarin, her, agave, topalov, thaler,
Nearest to called: massed, tamarin, naaman, michelob, sadler, UNK, mitral, primigenius,
Nearest to known: used, well, such, epoxy, adele, microcebus, bug, dasyprocta,
Nearest to will: would, could, can, may, should, must, moravians, to,
Nearest to people: reginae, pathological, odes, members, coquitlam, cebus, intercepts, saguinus,
Nearest to nine: eight, seven, six, five, four, zero, mitral, three,
Nearest to also: which, often, sometimes, zionist, generally, thibetanus, it, trinomial,
Nearest to eight: six, seven, five, nine, four, three, zero, michelob,
Nearest to are: were, is, have, be, thibetanus, while, pathfinder, cebus,
Nearest to all: many, these, some, asterism, two, reuptake, thibetanus, any,
Average loss at step 82000 : 4.79923895121
Average loss at step 84000 : 4.79056957233
Average loss at step 86000 : 4.7452732873
Average loss at step 88000 : 4.70395690095
Average loss at step 90000 : 4.76481224179
Nearest to b: d, gland, UNK, six, pron, microcebus, grants, david,
Nearest to s: his, mitral, and, zero, inches, clemency, michelob, tamarin,
Nearest to when: if, before, michelob, but, where, after, during, while,
Nearest to seven: eight, five, six, four, nine, three, zero, one,
Nearest to of: in, for, tamarin, same, nine, microcebus, msg, and,
Nearest to system: tamarin, thaler, improved, microsite, conflict, dinar, empowerment, instructed,
Nearest to its: their, his, the, her, tamarin, agave, celera, topalov,
Nearest to called: massed, tamarin, naaman, mitral, dreamers, michelob, sadler, UNK,
Nearest to known: used, well, such, epoxy, adele, bug, microcebus, hmong,
Nearest to will: would, can, could, may, must, should, moravians, cannot,
Nearest to people: reginae, members, pathological, odes, coquitlam, cebus, intercepts, saguinus,
Nearest to nine: eight, seven, six, five, four, zero, mitral, michelob,
Nearest to also: which, often, sometimes, zionist, generally, thibetanus, trinomial, now,
Nearest to eight: seven, six, five, nine, four, three, zero, two,
Nearest to are: were, is, have, be, thibetanus, while, include, pathfinder,
Nearest to all: many, some, these, thibetanus, dasyprocta, both, any, asterism,
Average loss at step 92000 : 4.72437152827
Average loss at step 94000 : 4.62979676688
Average loss at step 96000 : 4.71152837896
Average loss at step 98000 : 4.6148717382
Average loss at step 100000 : 4.676337744
Nearest to b: d, grants, gland, david, trailed, microcebus, circumstance, thaler,
Nearest to s: his, mitral, michelob, inches, clemency, medea, zero, tamarin,
Nearest to when: if, while, where, after, during, before, michelob, but,
Nearest to seven: eight, six, five, four, nine, three, zero, two,
Nearest to of: in, tamarin, and, thibetanus, for, microcebus, nine, eight,
Nearest to system: improved, systems, law, archers, microsite, thaler, conflict, tamarin,
Nearest to its: their, his, the, her, tamarin, agave, celera, topalov,
Nearest to called: massed, UNK, tamarin, naaman, interpreted, dreamers, mitral, fright,
Nearest to known: used, such, well, epoxy, microcebus, cryo, adele, bug,
Nearest to will: would, can, could, may, must, should, to, moravians,
Nearest to people: reginae, members, odes, pathological, coquitlam, cebus, intercepts, saguinus,
Nearest to nine: eight, seven, six, five, four, zero, three, dasyprocta,
Nearest to also: which, often, sometimes, zionist, generally, now, thibetanus, still,
Nearest to eight: seven, nine, five, six, four, three, zero, dasyprocta,
Nearest to are: were, is, have, while, be, include, pathfinder, thibetanus,
Nearest to all: many, these, some, thibetanus, both, any, asterism, several,
>>>
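
The "center -> context" lines near the top of the log (for example `3081 originated -> 5243 anarchism`) are skip-gram training pairs. Below is a minimal sketch of how such pairs can be formed, assuming a window of one word on each side. The function name `skipgram_pairs` is hypothetical and this is not the code of `word2vec_basic` itself; the IDs and words are copied from the "Sample data" line. The sketch emits the left neighbour before the right one, so the order within a center word may differ from the log.

```python
def skipgram_pairs(ids, skip_window=1):
    """Return (center, context) pairs for every position that has a full window."""
    pairs = []
    for i in range(skip_window, len(ids) - skip_window):
        for offset in range(-skip_window, skip_window + 1):
            if offset != 0:  # skip the center word itself
                pairs.append((ids[i], ids[i + offset]))
    return pairs

# IDs and words taken from the "Sample data" line of the log.
data = [5243, 3081, 12, 6, 195, 2, 3135, 46, 59, 156]
words = ['anarchism', 'originated', 'as', 'a', 'term',
         'of', 'abuse', 'first', 'used', 'against']
id_to_word = dict(zip(data, words))

# The first six tokens give four center words with two context words each.
pairs = skipgram_pairs(data[:6])
for center, context in pairs:
    print(center, id_to_word[center], '->', context, id_to_word[context])
```

The eight pairs produced this way are the same set as the eight pairs printed in the log before "Initialized".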

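The "Nearest to ..." lines list, for a handful of validation words, the words whose embedding vectors are currently closest. A common way to rank neighbours is cosine similarity, and that is the assumption made in the sketch below. The vectors are made-up 2-D toy values, not embeddings from this run, and the helper names `cosine` and `nearest` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest(word, embeddings, top_k=2):
    """Return the top_k words most similar to `word`, excluding the word itself."""
    query = embeddings[word]
    scored = [(cosine(query, vec), other)
              for other, vec in embeddings.items() if other != word]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:top_k]]

# Toy vectors for illustration only (assumed values).
toy = {
    'seven': [1.0, 0.1],
    'eight': [0.9, 0.2],
    'nine': [0.8, 0.3],
    'system': [0.1, 1.0],
}

print('Nearest to seven:', ', '.join(nearest('seven', toy)))
```

With trained embeddings, this kind of ranking is what makes number words such as "eight" and "nine" gather around "seven" as the loss falls in the log above.
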
Original article: https://www.cnblogs.com/weizhen/p/6276725.html