Anaconda Installation and Configuration

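1. Anaconda Installation

For an overview of Anaconda and the installation procedure, see the "Detailed Anaconda Installation and Usage Tutorial".

To set up the Anaconda environment variables, see "Configuring Environment Variables".

2. Changing the Anaconda and Pip Sources

  • Changing the Anaconda source: open Anaconda Prompt and run the following commands.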
    conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
    conda config --set show_channel_urls yes
  • Changing the pip source: in the local user directory, create a pip directory, create a pip.ini file inside it, add the following content, and save.
    [global] 
    index-url = https://pypi.tuna.tsinghua.edu.cn/simple
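
The pip.ini step above can also be scripted. The following is a minimal sketch, not part of the original procedure: the function name write_pip_config is hypothetical, and the user directory is passed in as an argument so the sketch never writes anywhere you did not choose.

```python
import configparser
from pathlib import Path


def write_pip_config(user_home: Path, index_url: str) -> Path:
    """Create <user_home>/pip/pip.ini containing a [global] index-url entry.

    Hypothetical helper: it mirrors the manual steps described above
    (create a pip directory, then a pip.ini file inside it).
    """
    pip_dir = user_home / "pip"
    pip_dir.mkdir(parents=True, exist_ok=True)

    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}

    ini_path = pip_dir / "pip.ini"
    with ini_path.open("w", encoding="utf-8") as fh:
        config.write(fh)
    return ini_path
```

Calling write_pip_config(Path.home(), "https://pypi.tuna.tsinghua.edu.cn/simple") would target the same location as the manual steps. Note that an existing pip.ini at that path is overwritten.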

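3. Common Anaconda Commands

3.1 Migrating Packages

  • Export the list of all packages installed in the current environment to a file named requirements.txt. The file is saved in the current user directory (the working directory of the prompt).
    pip freeze > requirements.txt
  • In the new environment, install the packages from requirements.txt.
    pip install -r C:\Users\XXX\requirements.txt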
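
Before installing from an exported file, it can help to inspect which versions it pins. The sketch below is an illustration only: the function name parse_pinned_requirements is hypothetical, and it handles only plain name==version lines, skipping anything else that pip freeze may emit (editable installs, direct file references).

```python
def parse_pinned_requirements(text: str) -> dict:
    """Return {package_name: version} for every 'name==version' line.

    Hypothetical helper. Comments, blank lines, and lines without '=='
    are ignored; package names are lower-cased for easier lookup.
    """
    pins = {}
    for raw in text.splitlines():
        # Drop trailing comments and environment markers such as "; python_version < '3'".
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins
```

For example, reading requirements.txt with open(...).read() and passing the text to this function gives a dictionary that can be compared between the old and the new environment.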

3.2 Creating, Activating, and Deactivating Environments

  • Create an environment
    conda create -n env_name package_name=version
  • Activate an environment

    (base) C:\Users\Administrator>activate superset

    (superset) C:\Users\Administrator>

  • List environments
    (base) C:\Users\Administrator>conda env list
    # conda environments:
    #
    base                  *  D:\ProSoftwares\Python\Anaconda3
    python36                 D:\ProSoftwares\Python\Anaconda3\envs\python36
    superset                 D:\ProSoftwares\Python\Anaconda3\envs\superset
  • Deactivate an environment
    (superset) C:\Users\Administrator>conda deactivate
    
    (base) C:\Users\Administrator>

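3.3 Cloning an Environment

Cloning also provides a way to rename an environment. First clone it under the new name:

(base) C:\Users\Administrator>conda create -n analysis --clone python36

Then remove the original environment:

(base) C:\Users\Administrator>conda remove -n python36 --all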
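
The clone-then-remove sequence can be wrapped in a small script. This is a sketch under assumptions: the helper names rename_env_commands and rename_env are hypothetical, a conda executable is assumed to be on PATH, and by default the commands are only printed rather than run.

```python
import subprocess


def rename_env_commands(old_name: str, new_name: str) -> list:
    """Build the two conda commands shown above as argument lists."""
    return [
        ["conda", "create", "-n", new_name, "--clone", old_name],
        ["conda", "remove", "-n", old_name, "--all"],
    ]


def rename_env(old_name: str, new_name: str, dry_run: bool = True) -> None:
    """Clone old_name to new_name, then remove old_name.

    With dry_run=True (the default) the commands are printed only.
    check=True stops the sequence if the clone fails, so the original
    environment is not removed after a failed clone.
    """
    for cmd in rename_env_commands(old_name, new_name):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Running rename_env("python36", "analysis") prints the same two commands as in the session above. conda may still ask for confirmation interactively when the commands are actually executed.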

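4. Installing Superset with Anaconda (Online)

4.1 Creating an Isolated Environment

(base) C:\Users\Administrator>conda create -n superset python=3.6

This creates an isolated environment so that its packages do not conflict with those of other environments.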

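4.2 Installing the VC++ Build Requirements

After activating the superset environment, installing directly with pip install superset ends with a "Failed to build superset python-geohash" error. The build environment (a C++ compiler) is missing, and pip prompts for a download: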

error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools

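The download link above no longer works; follow the VC++ 14.0 installation tutorial instead. Once it is installed, run pip install superset again and the installation completes normally: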

Successfully installed cchardet-2.1.4 et-xmlfile-1.0.1 ijson-2.3 jdcal-1.4.1 jsonlines-1.2.0 linear-tsv-1.1.0 openpyxl-2.4.11 pure-sasl-0.6.1 python-geohash-0.8.5 rfc3986-1.3.1 simplejson-3.16.0 sqlalchemy-utils-0.33.11 sqlparse-0.3.0 superset-0.28.1 tableschema-1.4.1 tabulator-1.20.0 thrift-0.11.0 thrift-sasl-0.3.0 unicodecsv-0.14.1 unidecode-1.0.23 xlrd-1.2.0

4.3 Configuring Superset

  • Create the Superset administrator account
    (superset) C:\Users\Administrator>fabmanager create-admin --app superset
    fabmanager is going to be deprecated in 2.2.X, you can use the same commands on the improved 'flask fab <command>'
    Username [admin]: admin
    User first name [admin]: Strive
    User last name [user]: Py
    Email [admin@fab.org]: strive@qq.com
    Password:
    Repeat for confirmation:
    Was unable to import superset Error: cannot import name '_maybe_box_datetimelike'

    The error Was unable to import superset Error: cannot import name '_maybe_box_datetimelike' occurs because the installed pandas version (0.24.2) is too new. Uninstall it and install version 0.23.4 instead:

    pip uninstall pandas
    
    pip install pandas==0.23.4

    Then create the administrator account again:

    (superset) C:\Users\Administrator>fabmanager create-admin --app superset
    fabmanager is going to be deprecated in 2.2.X, you can use the same commands on the improved 'flask fab <command>'
    Username [admin]: admin
    User first name [admin]: Strive
    User last name [user]: Py
    Email [admin@fab.org]: strive@qq.com
    Password:
    Repeat for confirmation:
    Recognized Database Authentications.
    Admin User admin created.
  • Initializing the database requires the python superset command, which must be run from the bin directory of the superset package (D:\ProSoftwares\Python\Anaconda3\envs\superset\Lib\site-packages\superset\bin):

    (superset) D:\ProSoftwares\Python\Anaconda3\envs\superset\Lib\site-packages\superset\bin>python superset
    Usage: superset [OPTIONS] COMMAND [ARGS]...
    
      This is a management script for the superset application.
    
    Options:
      --version  Show the flask version
      --help     Show this message and exit.
    
    Commands:
      db                        Perform database migrations.
      export_dashboards         Export dashboards to JSON
      export_datasource_schema  Export datasource YAML schema to stdout
      export_datasources        Export datasources to YAML
      fab                       FAB flask group commands
      flower                    Runs a Celery Flower web server Celery Flower...
      import_dashboards         Import dashboards from JSON
      import_datasources        Import datasources from YAML
      init                      Inits the Superset application
      load_examples             Loads a set of Slices and Dashboards and a...
      load_test_users           Loads admin, alpha, and gamma user for...
      refresh_druid             Refresh druid datasources
      run                       Runs a development server.
      runserver                 Starts a Superset web server.
      shell                     Runs a shell in the app context.
      update_datasources_cache  Refresh sqllab datasources cache
      version                   Prints the current version number
      worker                    Starts a Superset worker for async SQL query...
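  • Running python superset db upgrade to upgrade the database fails with sqlalchemy.exc.InvalidRequestError: Can't determine which FROM clause to join from, there are multiple FROMS which can  join to this entity. Try adding an explicit ON clause to help resolve the ambiguity. The cause is that the installed sqlalchemy version (1.3.3) is too new; after uninstalling it and installing version 1.2.0, the database upgrade succeeds.
  • Run python superset load_examples to load the example templates.
  • Run python superset init to initialize the user roles and permissions.
  • Starting the service with python superset runserver fails because Superset uses gunicorn as its application server and gunicorn does not support Windows. Add -d to the command line to run with the development web server instead. The final command is: python superset runserver -d

Finally, open localhost:8088 in a browser to reach the Superset login page.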
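
Both version problems in this section (pandas 0.24.2 and sqlalchemy 1.3.3) were resolved by installing an older release. The sketch below shows one way to spot such mismatches up front. It is an illustration only: KNOWN_GOOD and downgrade_commands are hypothetical names, the two pinned versions are the ones reported above for superset 0.28.1, and the installed versions are passed in as a plain dictionary rather than queried from the environment.

```python
# Versions that worked with superset 0.28.1 in the steps above.
KNOWN_GOOD = {"pandas": "0.23.4", "sqlalchemy": "1.2.0"}


def downgrade_commands(installed: dict, known_good: dict = KNOWN_GOOD) -> list:
    """Return pip commands for packages whose installed version differs.

    installed maps lower-cased package names to version strings, for
    example as collected from the output of pip freeze. Packages that
    are not installed at all are skipped.
    """
    commands = []
    for name, wanted in sorted(known_good.items()):
        current = installed.get(name)
        if current is not None and current != wanted:
            commands.append(f"pip install {name}=={wanted}")
    return commands
```

With the versions reported in this section, downgrade_commands({"pandas": "0.24.2", "sqlalchemy": "1.3.3"}) yields one pip install command per package; pip replaces the installed version when a different pinned version is requested.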

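4.4 Superset Database Query Error

Superset is built for Linux and macOS, and Windows lacks some of the system facilities it depends on. As a result, database queries fail with module 'signal' has no attribute 'SIGALRM' and return no data. The fix is to edit the signal-related code in utils.py in the superset installation directory (D:\ProSoftwares\Python\Anaconda3\envs\superset\Lib\site-packages\superset\utils.py). Open utils.py in a text editor and find the following code: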

    def __enter__(self):
        try:
            signal.signal(signal.SIGALRM, self.handle_timeout)
            signal.alarm(self.seconds)
        except ValueError as e:
            logging.warning("timeout can't be used in the current context")
            logging.exception(e)

    def __exit__(self, type, value, traceback):
        try:
            signal.alarm(0)
        except ValueError as e:
            logging.warning("timeout can't be used in the current context")
            logging.exception(e)

Then change it as follows, which turns the timeout into a no-op:

    def __enter__(self):
        try:
            # signal.signal(signal.SIGALRM, self.handle_timeout)
            # signal.alarm(self.seconds)
            pass
        except ValueError as e:
            logging.warning("timeout can't be used in the current context")
            logging.exception(e)

    def __exit__(self, type, value, traceback):
        try:
            # signal.alarm(0)
            pass
        except ValueError as e:
            logging.warning("timeout can't be used in the current context")
            logging.exception(e)

Then refresh Superset.
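
Commenting out the signal calls disables the timeout on every platform. An alternative is to keep it where SIGALRM exists and skip it elsewhere. The following is a standalone sketch of that idea, not Superset's code: the class name PortableTimeout is hypothetical, and the built-in TimeoutError stands in for whatever exception the real handle_timeout raises.

```python
import logging
import signal


class PortableTimeout:
    """Context manager that uses SIGALRM when available, else does nothing."""

    def __init__(self, seconds=1, error_message="Timeout"):
        self.seconds = seconds
        self.error_message = error_message
        # Windows has no SIGALRM, so the timeout is disabled there.
        self.enabled = hasattr(signal, "SIGALRM")

    def handle_timeout(self, signum, frame):
        raise TimeoutError(self.error_message)

    def __enter__(self):
        if not self.enabled:
            logging.warning("SIGALRM is not available; running without a timeout")
            return self
        try:
            signal.signal(signal.SIGALRM, self.handle_timeout)
            signal.alarm(self.seconds)
        except ValueError as e:
            # Raised when called outside the main thread.
            logging.warning("timeout can't be used in the current context")
            logging.exception(e)
        return self

    def __exit__(self, exc_type, value, traceback):
        if not self.enabled:
            return
        try:
            signal.alarm(0)  # cancel any pending alarm
        except ValueError as e:
            logging.warning("timeout can't be used in the current context")
            logging.exception(e)
```

Used as with PortableTimeout(seconds=30): ..., the body runs unchanged on Windows, while on Linux and macOS it is still interrupted once the limit is reached.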

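5. Installing Superset with Anaconda (Offline)

Because the online installation ran into so many problems, this section installs Superset manually from offline packages instead.

5.1 Installing the Dependencies with Pip

Find the dependency file requirements.txt in the GitHub source.

Then install the dependencies with pip:

pip install -r C:\Users\XXX\requirements.txt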

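If a version mismatch occurs along the way, change the version of the affected package in requirements.txt and continue the installation.

5.2 Installing the Superset Offline Package with Pip

Download the superset-0.34.0 package from the official site.

Then install the superset package with pip:

pip install C:\Users\XXX\apache-superset-0.34.0.tar.gz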

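5.3 Configuring Superset

The steps are largely the same as for the online installation, with two points to note:

  • Do not use admin as the username; it triggers an SQL error about a duplicate value in a unique field.
  • To start the service, run python superset run from inside the bin directory.

5.4 Superset Database Query Error

Superset is built for Linux and macOS, and Windows lacks some of the system facilities it depends on. As a result, database queries fail with module 'signal' has no attribute 'SIGALRM' and return no data. The fix is to edit the signal-related code (line 579) in core.py in the superset installation directory (D:\Prosoftwares\Python\Anaconda3\envs\superset\Lib\site-packages\superset\utils\core.py). Apply the same change as in 4.4, then restart the service.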

6. References

Original article: https://www.cnblogs.com/strivepy/p/10606856.html