Python + Jupyter installation

Shared library dependencies

 
ImportError: libffi.so.7: cannot open shared object file: No such file or directory

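An error like this means a shared library cannot be found. There are two possible causes:
- the library is not installed
- the library is installed, but in the wrong version

Shared libraries are not part of the Python installation; they live on the operating system. When the files are migrated to a system that does not have the same library version they were built against, the import fails.

Here libffi.so.7 is required but version 8 is installed, which means the files being run are old relative to the current environment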
root@lm:/opt/wks/bigmodel/chatglm3-6b# find /usr/ -name libffi.so
/usr/lib/x86_64-linux-gnu/libffi.so
root@lm:/opt/wks/bigmodel/chatglm3-6b# find /usr/ -name libffi.so.*
/usr/lib/x86_64-linux-gnu/libffi.so.8
/usr/lib/x86_64-linux-gnu/libffi.so.8.1.0
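
To tell the two cases apart, one option is to ask the dynamic loader directly for the exact soname. The following is a minimal sketch and not part of the original notes: the helper name `can_load` is hypothetical, and it relies only on `ctypes.CDLL` from the standard library. Note that `ctypes` itself depends on libffi, so if `import ctypes` fails, that failure is already the answer.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # libffi.so.7 is the version the failing import asked for;
    # libffi.so.8 is the version found on this system.
    for name in ("libffi.so.7", "libffi.so.8"):
        print(name, "loadable" if can_load(name) else "missing")
```

If the requested soname is missing while a newer one loads, the binaries were built against an older system and need to be rebuilt or reinstalled.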

Read this before installing

 
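This document describes a native installation: Python is compiled from source first, then the Python packages are installed. Keep the following points in mind
- Checking the system dependencies is mandatory
  - If an OS package is missing, anything that needs it later, whether Python itself or a Python package, fails with an error
- Are the system dependencies complete?
  - "Complete" is relative to what you actually use
  - For example, torchvision depends on lzma from the OS; if you never use torchvision, it does not matter that lzma is missing from the OS

- If installing a package with pip fails and the cause turns out to be a missing OS package
  - install that package on the OS
  - then recompile Python; yes, this step is tedious
  - after compiling, migrating to a new system does not necessarily require installing the same packages on the new OS
    - It depends on how the feature is used
    - In many cases the package only has to be present at build time, because the feature is then built into Python
    - In that case the new OS does not need the package after migration, since the OS copy is no longer used
    - What is used is the functionality of the compiled Python itself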
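
Whether a source build picked up each OS dependency can be checked right after `make install` by trying to import the corresponding extension modules. This is a minimal sketch under assumptions: the helper `missing_modules` and the `OPTIONAL_MODULES` table are illustrative names, and the module-to-RPM mapping uses the CentOS package names that appear in these notes.

```python
import importlib


# Extension modules that are only built when the matching OS
# development package was present at compile time.
OPTIONAL_MODULES = {
    "_ssl": "openssl-devel",
    "_ctypes": "libffi-devel",
    "_bz2": "bzip2-devel",
    "_lzma": "xz-devel",
    "_sqlite3": "sqlite-devel",
    "zlib": "zlib-devel",
}


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_modules(OPTIONAL_MODULES):
        print(f"{name} is missing; install {OPTIONAL_MODULES[name]} and rebuild Python")
```

Running it with the freshly built interpreter before installing any pip packages catches a missing dependency while a rebuild is still cheap.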

 
docker run -itd --privileged --name py3 -h py3 --net=host -v /tmp:/tmp -v /media:/media -v /media/xt/tpf/tpf:/opt/tpf  cent7  bash

alias py3="docker exec -it py3 bash"


System package installation

 
yum install -y gcc gcc-c++ kernel-devel
yum install -y openssl-devel zlib-devel

rpm -qa | grep openssl-devel
rpm -qa | grep zlib-devel
rpm -qa | grep bzip2-devel
rpm -qa | grep libffi-devel
rpm -qa |grep xz-devel
rpm -qa |grep python-backports-lzma

mkdir /data
cd /data 
rsync -rltDv /media/xt/tpf/soft/jupyter.tar.gz ./
tar -xvf jupyter.tar.gz
cd jupyter/soft/rpm/

rpm -Uvh --force --nodeps *.rpm

adduser py39 
chown -R py39.py39 /data/jupyter

 
su - py39

vim .bash_profile
Set the Python environment variables so that Jupyter knows where the Python library directory is
export PYTHONPATH=/data/jupyter/python/lib/python3.9/site-packages
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile


rsync -rltDv /data/jupyter/lib/.local ~/
rsync -rltDv /data/jupyter/lib/.jupyter ~/
    

 
Anaconda began charging for commercial use in 2020, particularly for companies with more than 200 employees, so conda is not used here.

Python download
https://www.python.org/ftp/python/3.9.19/Python-3.9.19.tar.xz

docker run -itd --privileged --name pyenv -h pyenv --net=host -v /opt:/opt -v /tmp:/tmp -v /media:/media -v /data:/data cent7  bash

docker start pyenv
docker exec -it pyenv bash 
yum -y install gcc gcc-c++ kernel-devel

 


 


 

yum install gcc openssl-devel bzip2-devel

yum install libffi-devel -y

rpm -qa | grep openssl-devel
rpm -qa | grep bzip2-devel
rpm -qa | grep zlib-devel
rpm -qa | grep libffi-devel

yum search openssl-devel
yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ openssl-devel.x86_64

yum search bzip2-devel
yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ bzip2-devel.x86_64

yum search libffi-devel
yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ libffi-devel.x86_64



rpm -Uvh --force --nodeps *.rpm


https://www.sqlite.org/download.html

https://www.sqlite.org/2025/sqlite-src-3480000.zip


wget https://www.sqlite.org/snapshot/sqlite-snapshot-202404051413.tar.gz --no-check-certificate

Many programs depend on sqlite, and it must be installed before Python
tar -xvf sqlite-snapshot-202404051413.tar.gz
cd sqlite-snapshot-202404051413
./configure -prefix=/data/jupyter/sqlite3
make
make install


Recompile and install python3
tar -xvf Python-3.9.19.tar.xz 
cd Python-3.9.19
vim setup.py 
Append the include directory of the sqlite3 built above ('/data/jupyter/sqlite3/include') as the last entry of the list below:
sqlite_inc_paths = [ '/usr/include',
                                '/usr/include/sqlite',
                                '/usr/include/sqlite3',
                                '/usr/local/include',
                                '/usr/local/include/sqlite',
                                '/usr/local/include/sqlite3',
                                '/data/jupyter/sqlite3/include',
                                ]

./configure --prefix=/data/jupyter/python/
make 
sudo make install

useradd py39
chown -R py39.py39 /data/jupyter/
su - py39 
vim .bash_profile 
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile 

[py39@pyenv ~]$ which python
/bin/python
[py39@pyenv ~]$ which python3
/data/jupyter/python/bin/python3


[py39@pyenv ~]$ cd /data/jupyter/python/bin/
[py39@pyenv bin]$ ls
2to3  2to3-3.9  idle3  idle3.9  pip3  pip3.9  pydoc3  pydoc3.9  python3  python3-config  python3.9  python3.9-config
[py39@pyenv bin]$ ln -s python3 python
[py39@pyenv bin]$ which python
/data/jupyter/python/bin/python


vim .bash_profile 
export PYTHONPATH=/data/jupyter/python/lib/python3.9/site-packages
export PYTHONIOENCODING=UTF-8
. ./.bash_profile 


 
[py39@pyenv ~]$ cd /data/jupyter/python/bin/
[py39@pyenv bin]$ ls
2to3  2to3-3.9  idle3  idle3.9  pip3  pip3.9  pydoc3  pydoc3.9  python  python3  python3-config  python3.9  python3.9-config
[py39@pyenv bin]$ ln -s pip3 pip
[py39@pyenv bin]$ which pip
/data/jupyter/python/bin/pip

    
pip install jupyter 


 
mkdir -p /data/jupyter/{python,soft/rpm,wks}
useradd py39
chown -R py39.py39 /data/jupyter 

cd /data/jupyter/soft 
wget https://www.python.org/ftp/python/3.9.19/Python-3.9.19.tar.xz

tar -xvf sqlite-snapshot-202404051413.tar.gz
cd sqlite-snapshot-202404051413
./configure -prefix=/data/jupyter/sqlite3
make 
make install 



tar -xvf Python-3.9.19.tar.xz 
cd Python-3.9.19
vim setup.py 
Append '/data/jupyter/sqlite3/include' as the last entry of the list below:
sqlite_inc_paths = [ '/usr/include',
                                '/usr/include/sqlite',
                                '/usr/include/sqlite3',
                                '/usr/local/include',
                                '/usr/local/include/sqlite',
                                '/usr/local/include/sqlite3',
                                '/data/jupyter/sqlite3/include',
                                ]

./configure --prefix=/data/jupyter/python/
make 
sudo make install

 
chown -R py39.py39 /data/jupyter/
su - py39 
vim .bash_profile 
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile

cd /data/jupyter/python/bin/
ln -s python3 python
ln -s pip3 pip



[py39@fine-bump-3 bin]$ which python
/data/jupyter/python/bin/python
[py39@fine-bump-3 bin]$ which pip
/data/jupyter/python/bin/pip

 
pip install traitlets==5.9.0

pip install jupyter 

ImportError: urllib3 v2 only supports OpenSSL 1.1.1+
pip uninstall urllib3
pip install urllib3==1.25.8
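
Before pinning urllib3, it can help to confirm which OpenSSL the interpreter was actually linked against. The sketch below is an illustration, not the author's procedure: the function names are hypothetical, and the 1.1.1 threshold is taken from the error message above.

```python
def openssl_at_least(version_info, minimum=(1, 1, 1)):
    """Compare an ssl.OPENSSL_VERSION_INFO-style tuple against a minimum."""
    return tuple(version_info[:3]) >= tuple(minimum)


def urllib3_v2_supported():
    """Return True if this interpreter's OpenSSL meets the 1.1.1 minimum."""
    import ssl  # imported here so a build without _ssl fails only at call time

    return openssl_at_least(ssl.OPENSSL_VERSION_INFO)


if __name__ == "__main__":
    import ssl

    print(ssl.OPENSSL_VERSION)
    if urllib3_v2_supported():
        print("OpenSSL is new enough for urllib3 v2")
    else:
        print("OpenSSL is older than 1.1.1; keep urllib3 below 2")
```

On CentOS 7 the system OpenSSL is 1.0.2k, so a Python built against it falls into the second branch unless it was compiled against a newer OpenSSL.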


    

Jupyter remote access configuration

 
ssh -p 26225 144.34.185.72

Set the Python environment variables so that Jupyter knows where the Python library directory is
export PYTHONPATH=/data/jupyter/python/lib/python3.9/site-packages

jupyter notebook password # set the Jupyter password
jupyter notebook --generate-config # generate the config file at ~/.jupyter/jupyter_notebook_config.py

Example
jupyter notebook password
book_1234

$ jupyter notebook --generate-config
vim ~/.jupyter/jupyter_notebook_config.py
c.NotebookApp.ip='*'
c.NotebookApp.open_browser=False
c.NotebookApp.port=8888
    

 
docker run -itd --privileged --name py1 -h py1 --net=host -v /tmp:/tmp -v /media:/media cent7  bash

alias py1="docker exec -it py1 bash"
yum -y install gcc gcc-c++ kernel-devel

rpm -qa | grep openssl-devel
rpm -qa | grep zlib-devel

rpm -qa | grep bzip2-devel
rpm -qa | grep libffi-devel


yum search bzip2-devel
yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ bzip2-devel.x86_64

yum search libffi-devel
yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ libffi-devel.x86_64

yum install -y openssl-devel zlib-devel

rpm -Uvh --force --nodeps *.rpm

 
ssh -p 26225 144.34.185.72
useradd py39
mkdir /data 
cd /data 
rsync -e 'ssh -p26225' -avP py39@144.34.185.72:/data/jupyter ./ 

>>> import jupyter_core
>>> jupyter_core.__file__
'/home/py39/.local/lib/python3.9/site-packages/jupyter_core/__init__.py'

cd
rsync -e 'ssh -p26225' -avP py39@144.34.185.72:/home/py39/.local ./

 
chown -R py39.py39 /data/jupyter/

su - py39 
Set the Python environment variables so that Jupyter knows where the Python library directory is
export PYTHONPATH=/data/jupyter/python/lib/python3.9/site-packages
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile

[py39@fine-bump-3 ~]$ which pip
/data/jupyter/python/bin/pip
[py39@fine-bump-3 ~]$ which python
/data/jupyter/python/bin/python


pip install --upgrade pip
pip install traitlets==5.9.0
pip install jupyter

ImportError: urllib3 v2 only supports OpenSSL 1.1.1+
pip uninstall urllib3
pip install urllib3==1.25.8

pip install notebook==6.4.12


 
jupyter notebook password # set the Jupyter password
jupyter notebook --generate-config # generate the config file at ~/.jupyter/jupyter_notebook_config.py

Example
jupyter notebook password
book_1234

$ jupyter notebook --generate-config
vim ~/.jupyter/jupyter_notebook_config.py
c.NotebookApp.ip='*'
c.NotebookApp.open_browser=False
c.NotebookApp.port=8008

 
[py39@py1 ~]$ cd /data/jupyter/wks/
[py39@py1 wks]$ jupyter notebook
    

 
rsync -avP ~/.local /data/jupyter/lib/
rsync -avP ~/.jupyter /data/jupyter/lib/

 
Create the Docker container
docker run -itd --privileged --name py2 -h py2 --net=host -v /tmp:/tmp -v /media:/media cent7  bash
alias py2="docker exec -it py2 bash"

System dependency packages
yum -y install gcc gcc-c++ kernel-devel
rpm -qa | grep openssl-devel
rpm -qa | grep zlib-devel
rpm -qa | grep bzip2-devel
rpm -qa | grep libffi-devel

openssl-devel and zlib-devel are already installed in this environment
yum install -y openssl-devel zlib-devel


  
adduser py39 
mkdir -p /data/
cd /data 
rsync -rltDv /media/xt/tpf/soft/jupyter.tar.gz ./

Install bzip2-devel and libffi-devel

Online installation
yum install -y bzip2-devel libffi-devel

Or offline installation
cd /data/jupyter/soft/rpm/
rpm -Uvh --force --nodeps bzip2-devel-1.0.6-13.el7.x86_64.rpm libffi-3.0.13-19.el7.x86_64.rpm libffi-devel-3.0.13-19.el7.x86_64.rpm

chown -R py39.py39 /data/jupyter/

oracle instant-client
https://www.oracle.com/cn/database/technologies/instant-client/linux-x86-64-downloads.html


 
su - py39 
Set the Python environment variables so that Jupyter knows where the Python library directory is

vim .bash_profile
export PYTHONPATH=/data/jupyter/python/lib/python3.9/site-packages
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile

[py39@py2 ~]$ which pip
/data/jupyter/python/bin/pip
[py39@py2 ~]$ which python
/data/jupyter/python/bin/python


rsync -rltDv /data/jupyter/lib/.local ~/
rsync -rltDv /data/jupyter/lib/.jupyter ~/


 
Database connection and custom component migration

 
Create the Docker container
docker run -itd --privileged --name py2 -h py2 --net=host -v /tmp:/tmp -v /media:/media cent7  bash
alias py2="docker exec -it py2 bash"

System dependency packages
yum install -y gcc gcc-c++ kernel-devel
yum install -y openssl-devel zlib-devel

adduser py39 

System dependency check

 
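If Python was compiled from source and a missing OS dependency is discovered only afterwards, both of the following steps are required
1. Install the OS dependency
2. Recompile and reinstall Python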

    

ModuleNotFoundError: No module named '_lzma'

 
Needed by torchvision
rpm -qa |grep xz-devel
rpm -qa |grep python-backports-lzma
    
yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ xz-devel.x86_64
rpm -Uvh --force --nodeps xz*.rpm


yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ python-backports-lzma.x86_64
rpm -Uvh --force --nodeps python*.rpm


pip3 install backports.lzma

System dependency packages

 



yum install xz-devel -y
yum install python-backports-lzma -y
pip install backports.lzma

yum search backports.lzma





>>> import torch
>>> torch.__version__
'2.1.1'

>>> import torch
>>> torch.__version__
'2.2.2+cu121'

>>> import torch
>>> torch.__version__
'2.1.1+cu121'

pip install torchaudio==2.1.1
pip install torchvision==0.16.1


ssh -p 26225 144.34.185.72
rsync -e 'ssh -p26225' -avP /data/jupyter/python py39@144.34.185.72:/data/jupyter/

rsync -e 'ssh -p26225' -avP /data/jupyter py39@144.34.185.72:/data/


 
mkdir /data 
cd /data
rsync -rltDv /media/xt/tpf/soft/jupyter.tar.gz ./
tar -xvf jupyter.tar.gz
cd /data/jupyter/soft/rpm/
rpm -ivh oracle-instantclient-basic-21.13.0.0.0-1.x86_64.rpm 

chown -R py39.py39 /data/jupyter

su - py39

vim .bash_profile
Set the Python environment variables so that Jupyter knows where the Python library directory is
export PYTHONPATH=/data/jupyter/python/lib/python3.9/site-packages
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile


rsync -rltDv /data/jupyter/lib/.local ~/
rsync -rltDv /data/jupyter/lib/.jupyter ~/


 
System package installation
yum install -y gcc gcc-c++ kernel-devel
yum install -y openssl-devel zlib-devel

rpm -qa | grep openssl-devel
rpm -qa | grep zlib-devel
rpm -qa | grep bzip2-devel
rpm -qa | grep libffi-devel
rpm -qa |grep xz-devel
rpm -qa |grep python-backports-lzma

yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ xz-devel.x86_64
rpm -Uvh --force --nodeps xz*.rpm

yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ python-backports-lzma.x86_64
rpm -Uvh --force --nodeps python*.rpm

rpm -Uvh --force --nodeps *.rpm

 
Python installation
tar -xvf Python-3.9.19.tar.xz 
cd Python-3.9.19
vim setup.py 
Append '/data/jupyter/sqlite3/include' as the last entry of the list below:
sqlite_inc_paths = [ '/usr/include',
                                '/usr/include/sqlite',
                                '/usr/include/sqlite3',
                                '/usr/local/include',
                                '/usr/local/include/sqlite',
                                '/usr/local/include/sqlite3',
                                '/data/jupyter/sqlite3/include',
                                ]

./configure --prefix=/data/jupyter/python/
make 
sudo make install

[root@fine-bump-3 Python-3.9.19]# cd /data/jupyter/python/bin/
[root@fine-bump-3 bin]# ls
2to3  2to3-3.9  idle3  idle3.9  pip3  pip3.9  pydoc3  pydoc3.9  python3  python3.9  python3.9-config  python3-config
[root@fine-bump-3 bin]# ln -s pip3 pip
[root@fine-bump-3 bin]# ln -s python3 python


chown -R py39.py39 /data/jupyter

su - py39

vim .bash_profile
Set the Python environment variables so that Jupyter knows where the Python library directory is
export PYTHONPATH=/data/jupyter/python/lib/python3.9/site-packages
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile
    

 
pip install --upgrade pip
pip install backports.lzma
pip install traitlets==5.9.0
pip install urllib3==1.25.8  # for compatibility with older OpenSSL versions
pip install jupyter
pip install notebook==6.4.12  # only to stay consistent with the earlier notes; not technically required

 
pip install pandas
pip install joblib
pip install scikit-learn
pip install hmmlearn
pip install sklearn_crfsuite
pip install chinese_calendar
pip install matplotlib
pip install pydotplus
pip install jieba
pip install preprocess

pip install xgboost
pip install xgboost -i https://pypi.tuna.tsinghua.edu.cn/simple 
pip install xgboost -i https://pypi.douban.com/simple --trusted-host pypi.douban.com

pip install catboost
pip install catboost -i https://pypi.tuna.tsinghua.edu.cn/simple # use a mirror to speed up the download
pip install catboost -i https://pypi.douban.com/simple --trusted-host pypi.douban.com

pip install sklearn-pandas
pip install --user --upgrade git+https://github.com/jpmml/sklearn2pmml.git

pip install requests
pip install Flask
pip install gevent 
pip install Flask-APScheduler

pip install cx_Oracle 
pip install pymysql==1.0.2
pip install sqlalchemy

pip install --user  -i https://pypi.tuna.tsinghua.edu.cn/simple py2neo

pip install lightgbm -i https://pypi.tuna.tsinghua.edu.cn/simple 

 
pip install torch torchvision torchaudio

pip install onnx onnxruntime

import torchvision  # this import often fails when a system dependency is missing

import torch
torch.__version__

2.2.2+cu121
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-geometric
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-2.2.2+cu121.html

    

vim /data/jupyter/python/lib/python3.9/lzma.py

>>> import torchvision
Note: if the import succeeds, there is no need to modify the source

# before
from _lzma import *
from _lzma import _encode_filter_properties, _decode_filter_properties

# after
try:
    from _lzma import *
    from _lzma import _encode_filter_properties, _decode_filter_properties
except ImportError:
    from backports.lzma import *
    from backports.lzma import _encode_filter_properties, _decode_filter_properties
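
Whether the patch is needed at all, and whether it worked afterwards, can be checked with a small round trip through `lzma`. This is a minimal sketch; the function name `lzma_roundtrip_ok` is hypothetical and only standard `lzma.compress` / `lzma.decompress` calls are used.

```python
def lzma_roundtrip_ok(payload=b"lzma check"):
    """Return True if lzma imports and can compress and decompress data."""
    try:
        import lzma
    except ImportError:
        # The _lzma extension was not built and no fallback is in place.
        return False
    return lzma.decompress(lzma.compress(payload)) == payload


if __name__ == "__main__":
    print("lzma usable:", lzma_roundtrip_ok())
```

If this reports False on a fresh build, installing xz-devel and recompiling Python is the cleaner fix; patching lzma.py is the fallback when a rebuild is not practical.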
    
    

 

    ssh -p 26225 144.34.185.72
rsync -e 'ssh -p26225' -avP /data/jupyter/python py39@144.34.185.72:/data/jupyter/

rsync -e 'ssh -p26225' -avP /data/jupyter py39@144.34.185.72:/data/


rsync -e 'ssh -p26225' -avP /data/jupyter/python py39@144.34.185.72:/data/jupyter/

rsync -e 'ssh -p26225' -avP py39@144.34.185.72:/data/jupyter /data/ 



    rsync -rltDv /data/jupyter/lib/.local ~/
    rsync -rltDv /data/jupyter/lib/.jupyter ~/
    

 
After migrating to production, the offline installation revealed that the openssl-libs package was missing from the system


docker run -itd --privileged --name py2 -h py2 --net=host -v /tmp:/tmp -v /media:/media cent7  bash

alias py2="docker exec -it py2 bash"

This has to be done on a clean system: if the package is already installed, yum will not download it again
yum search openssl-libs
yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ openssl-libs.x86_64

yumdownloader --resolve --destdir=/root/mypackages/ openssl-libs.x86_64
    
yum install yum-utils
yumdownloader package-name --resolve --destdir=/path/to/directory --releasever=version

yumdownloader openssl-libs.x86_64 --resolve --destdir=/data/jupyter/soft/rpm/ --releasever=1.0.0

yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ --releasever=1.0.0 openssl-libs.x86_64

import _ssl ImportError: libssl.so.10: cannot open shared object file: No such file or directory 


 

    

 

    

 

    

 
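Jupyter is installed; now install the other packages
soft/rpm needs to be synchronized in both directions: the remote copy is missing some RPM packages, such as sqlite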

 



ssh -p 26225 144.34.185.72
rsync -e 'ssh -p26225' -avP /data/jupyter/python py39@144.34.185.72:/data/jupyter/

rsync -e 'ssh -p26225' -avP 

 

sudo yum install sqlite-devel

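From Python 3.10 on, editing setup.py to point at a custom sqlite is no longer supported; sqlite has to be installed separately
You can first try the compiled Python without installing sqlite, and install sqlite3 on the OS only if that does not work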
yum search sqlite-devel
yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ sqlite-devel.x86_64
rpm -Uvh --force --nodeps xz*.rpm
rpm -Uvh --force --nodeps sqlite*.rpm

# rpm -qa|grep openssl
openssl-devel-1.0.2k-26.el7_9.x86_64
openssl-1.0.2k-26.el7_9.x86_64
openssl-libs-1.0.2k-26.el7_9.x86_64

    
cd /data/jupyter/soft
wget https://www.openssl.org/source/openssl-1.1.1w.tar.gz --no-check-certificate
tar -xvf openssl-1.1.1w.tar.gz 
cd openssl-1.1.1w

./config shared --prefix=/data/jupyter/ssl/
sudo make 
sudo make install 

mkdir lib 
cp ./*.{so,so.1.1,a,pc} ./lib 
    
https://blog.csdn.net/weixin_45141207/article/details/132836256

 
https://www.python.org/ftp/python/3.9.19/Python-3.9.19.tar.xz
https://www.python.org/ftp/python/3.10.14/Python-3.10.14.tar.xz



cd Python-3.9.19
#--enable-optimizations 



./configure --prefix=/data/jupyter/python/ --with-openssl=/data/jupyter/ssl --with-libs=/data/jupyter/ssl/lib/

./configure --prefix=/data/jupyter/python/ --with-openssl=/data/jupyter/ssl --with-openssl-rpath=auto --disable-ipv6

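--with-openssl-rpath 
- This option was introduced in Python 3.10

The uncertain part is --with-openssl
- Different sources point this option at different locations; the variants that failed are not listed here
- Pointing it at the OpenSSL install directory produced the error below. The first suspicion was that the Python version was too old (this installation used Python 3.9)
    Following modules built successfully but were removed because they could not be imported:
    _hashlib              _ssl    
  - The Python version was suspected because the option itself had taken effect: configure found the headers under the given path
    checking for openssl/ssl.h in /data/jupyter/ssl... yes                         
- After upgrading to Python 3.10.14, _ssl still could not be imported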

checking for openssl/ssl.h in /data/jupyter/ssl... yes
checking whether compiling and linking against OpenSSL works... yes
checking for --with-openssl-rpath... auto
checking whether OpenSSL provides required APIs... yes
checking for --with-ssl-default-suites... python




sudo make 
sudo make install 
    
--with-openssl points at the actual installation location of the OpenSSL libraries


cd /data/jupyter/python/bin/
ln -s pip3 pip
ln -s python3 python
ls
2to3  2to3-3.10  idle3  idle3.10  pip  pip3  pip3.10  pydoc3  pydoc3.10  python  python3  python3.10  python3.10-config  python3-config



 
adduser py39 
chown -R py39.py39 /data/jupyter

su - py39 
vim .bash_profile
Set the Python environment variables so that Jupyter knows where the Python library directory is
export PYTHONPATH=/data/jupyter/python/lib/python3.10/site-packages
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile

import ssl 
ssl.OPENSSL_VERSION 
'OpenSSL 1.1.1w  11 Sep 2023'


 
pip install --upgrade pip
pip install backports.lzma
pip install traitlets==5.9.0


pip install jupyter
or
pip install urllib3==1.25.8  # for compatibility with older OpenSSL versions
pip install jupyter
pip install notebook==6.4.12  # only to stay consistent with the earlier notes; not technically required
    

 
pip install pandas
pip install joblib

pip install sklearn-pandas
pip install --user --upgrade git+https://github.com/jpmml/sklearn2pmml.git

pip install scikit-learn
pip install hmmlearn
pip install sklearn_crfsuite
pip install chinese_calendar
pip install matplotlib
pip install pydotplus
pip install jieba
pip install preprocess

pip install xgboost
pip install xgboost -i https://pypi.tuna.tsinghua.edu.cn/simple 
pip install xgboost -i https://pypi.douban.com/simple --trusted-host pypi.douban.com

pip install catboost
pip install catboost -i https://pypi.tuna.tsinghua.edu.cn/simple # use a mirror to speed up the download
pip install catboost -i https://pypi.douban.com/simple --trusted-host pypi.douban.com


pip install requests
pip install Flask
pip install gevent 
pip install Flask-APScheduler

pip install cx_Oracle 
pip install pymysql==1.0.2
pip install sqlalchemy

pip install --user  -i https://pypi.tuna.tsinghua.edu.cn/simple py2neo

pip install lightgbm -i https://pypi.tuna.tsinghua.edu.cn/simple 

    

 
pip install torch torchvision torchaudio

pip install onnx onnxruntime

import torchvision  # this import often fails when a system dependency is missing

import torch
torch.__version__

2.2.2+cu121
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-geometric
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-2.2.2+cu121.html

    

 

mkdir /data
cd /data 

rsync -e 'ssh -p26225' -avP py39@144.34.185.72:/data/jupyter /data/ 

adduser py39 
chown -R py39.py39 /data/jupyter

su - py39
    
vim .bash_profile
Set the Python environment variables so that Jupyter knows where the Python library directory is
export PYTHONPATH=/data/jupyter/python/lib/python3.9/site-packages
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile

rsync -rltDv /data/jupyter/lib/.jupyter ~/

 

    

 
yum install -y gcc gcc-c++ kernel-devel

userdel py39
rm -rf /home/py39/
adduser py39
rm -rf python/
mkdir python

    
cd Python-3.10.14
./configure --prefix=/data/jupyter/python/ --with-openssl=/data/jupyter/ssl --with-openssl-rpath=auto
make 
make install

cd /data/jupyter/python/bin/
ln -s pip3 pip
ln -s python3 python

chown -R py39.py39 /data/jupyter/
su - py39

vim .bash_profile
export PYTHONPATH=/data/jupyter/python/lib/python3.10/site-packages
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile

import ssl 
ssl.OPENSSL_VERSION 
'OpenSSL 1.1.1w  11 Sep 2023'

>>> import sqlite3
>>> 




 
pip install --upgrade pip
pip install backports.lzma
pip install traitlets==5.9.0
pip install jupyter


 
pip install pandas
pip install joblib

pip install sklearn-pandas
pip install --user --upgrade git+https://github.com/jpmml/sklearn2pmml.git

pip install scikit-learn
pip install hmmlearn
pip install sklearn_crfsuite
pip install chinese_calendar
pip install matplotlib
pip install pydotplus
pip install jieba
pip install preprocess

pip install xgboost
pip install xgboost -i https://pypi.tuna.tsinghua.edu.cn/simple 
pip install xgboost -i https://pypi.douban.com/simple --trusted-host pypi.douban.com

pip install catboost
pip install catboost -i https://pypi.tuna.tsinghua.edu.cn/simple # use a mirror to speed up the download
pip install catboost -i https://pypi.douban.com/simple --trusted-host pypi.douban.com


pip install requests
pip install Flask
pip install gevent 
pip install Flask-APScheduler

pip install cx_Oracle 
pip install pymysql==1.0.2
pip install sqlalchemy

pip install --user  -i https://pypi.tuna.tsinghua.edu.cn/simple py2neo

pip install lightgbm -i https://pypi.tuna.tsinghua.edu.cn/simple 

 
pip install torch torchvision torchaudio

pip install onnx onnxruntime

import torchvision  # this import often fails when a system dependency is missing

import torch
torch.__version__

2.2.2+cu121
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-geometric
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-2.2.2+cu121.html


 
mkdir /data
cd /data 
adduser py39

rsync -e 'ssh -p26225' -avP py39@144.34.185.72:/data/jupyter /data/ 

chown -R py39.py39 /data/jupyter/
su - py39
rsync -rltDv /data/jupyter/lib/.jupyter ~/


vim .bash_profile
Set the Python environment variables so that Jupyter knows where the Python library directory is
export PYTHONPATH=/data/jupyter/python/lib/python3.9/site-packages
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile


 

    

 
docker run -itd --privileged --name py2 -h py2 --net=host -v /tmp:/tmp -v /media:/media cent7  bash

alias py2="docker exec -it py2 bash"

Required at build time

 
yum install -y gcc gcc-c++ kernel-devel
yum install -y openssl-devel zlib-devel

sudo yum install sqlite-devel
With Python 3.9 you can edit setup.py instead; with Python 3.10 the only option is yum install

yum search sqlite-devel
yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ sqlite-devel.x86_64



rpm -qa |grep sqlite-devel
rpm -qa | grep openssl-devel
rpm -qa | grep zlib-devel
rpm -qa | grep bzip2-devel
rpm -qa | grep libffi-devel
rpm -qa |grep xz-devel
rpm -qa |grep python-backports-lzma


Required at run time

 
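gcc is only needed to compile packages, not at run time
Once OpenSSL has been compiled into Python, it does not need to be installed on the OS at run time
The same applies to sqlite3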

 

mkdir /data
cd /data 
rsync -rltDv /media/xt/tpf/soft/jupyter.tar.gz ./


yum search sqlite-devel
yum install --downloadonly --downloaddir=/data/jupyter/soft/rpm/ sqlite-devel.x86_64

rpm -Uvh --force --nodeps *.rpm
    

Building and installing OpenSSL

 
cd /data/jupyter/soft
wget https://www.openssl.org/source/openssl-1.1.1w.tar.gz --no-check-certificate
tar -xvf openssl-1.1.1w.tar.gz 
cd openssl-1.1.1w

./config shared --prefix=/data/jupyter/ssl/
sudo make 
sudo make install 

mkdir lib 
cp ./*.{so,so.1.1,a,pc} ./lib 

sqlite3

 
sudo yum install sqlite-devel
With Python 3.9 you can edit setup.py instead; with Python 3.10 the only option is yum install

Other dependency packages

 
rpm -qa | grep openssl-devel   # needed by jupyter
rpm -qa | grep zlib-devel
rpm -qa | grep bzip2-devel
rpm -qa | grep libffi-devel
rpm -qa | grep xz-devel        # needed by torchvision
rpm -qa | grep python-backports-lzma

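Only the OS packages behind the Python packages you actually use need to be installed; the others can be skipped

Building and installing Python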

 

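The OpenSSL install prefix:
--with-openssl=/data/jupyter/ssl/

Available only from Python 3.10 onward; it helps locate the OpenSSL libraries, and without it the ssl module may fail to import:
--with-openssl-rpath=auto 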

https://www.python.org/ftp/python/3.10.14/Python-3.10.14.tar.xz

./configure --prefix=/data/jupyter/python/ --with-openssl=/data/jupyter/ssl --with-openssl-rpath=auto 
sudo make 
sudo make install


 

cd /data/jupyter/python/bin/
ln -s pip3 pip
ln -s python3 python
ls

 


Create the Docker container

   
docker run -itd --privileged --name py3 -h py3 --net=host -v /tmp:/tmp -v /media:/media -v /media/xt/tpf/tpf:/opt/tpf  cent7  bash

alias py3="docker exec -it py3 bash"

Unpack

 
mkdir /data
cd /data 
rsync -rltDv /media/xt/tpf/soft/jupyter.tar.gz ./
tar -xvf jupyter.tar.gz

User and environment setup

 
adduser py39 
chown -R py39.py39 /data/jupyter 
su - py39 

vim .bash_profile
export PYTHONPATH=/data/jupyter/python/lib/python3.10/site-packages
export PATH=/data/jupyter/python/bin:$PATH
. ./.bash_profile

rsync -rltDv /data/jupyter/lib/.jupyter ~/
    

Start

 
cd /data/jupyter/wks/
jupyter notebook
    
    

mysql

 
RuntimeError: 'cryptography' package is required for sha256_password or caching_sha2_password auth methods

pip3 install cryptography

pip3 install cryptography -i https://pypi.tuna.tsinghua.edu.cn/simple

If the password is created with the mysql_native_password method, as below, this package does not need to be installed
create user 'admin'@'%' identified by '2y';
GRANT ALL PRIVILEGES ON *.* TO 'admin'@'%'  WITH GRANT OPTION;
ALTER USER 'admin'@'%' IDENTIFIED WITH mysql_native_password BY '2y';
    

Jupyter cannot delete files

 
chown py39 /data 

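Do not add the -R option

The working directory is /data/jupyter/wks,
but the trash files created when something is deleted are placed under /data, so py39 needs write access to /data itself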
    

import sklearn2pmml fails

 
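The package had been installed, but after migrating to production it behaved as if it had never been installed
Investigation showed that on the development machine it had been installed into .local under the home directory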
    

 
import sys
sys.path
['',
 '/data/jupyter/python/lib/python3.10/site-packages',
 '/data/jupyter/python/lib/python310.zip',
 '/data/jupyter/python/lib/python3.10',
 '/data/jupyter/python/lib/python3.10/lib-dynload',
 '/home/py39/.local/lib/python3.10/site-packages']
    
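The default import path has six entries, and the last one is under the home directory
When a package is installed by a user who has no write permission on the Python installation directory,
pip places the new package under that user's home directory, which is always writable

This also makes it easy to give different users different sets of packages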

 
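sklearn2pmml was left behind in this migration because it had been installed under the home directory (the install command earlier in these notes used --user),
and that directory was overlooked during the migration.
Of the six entries, only the home one lies outside the installation directory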
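
Before copying an installation to another machine, it may be worth listing every import location that a copy of the install prefix alone would miss. The sketch below is illustrative only: `lives_outside_prefix` is a hypothetical helper, while `site.getusersitepackages()` and `sys.prefix` are standard library facilities.

```python
import os
import site
import sys


def lives_outside_prefix(module_file, prefix):
    """Return True if module_file is not located under the install prefix."""
    module_file = os.path.abspath(module_file)
    prefix = os.path.abspath(prefix)
    return os.path.commonpath([module_file, prefix]) != prefix


if __name__ == "__main__":
    print("install prefix :", sys.prefix)
    print("user site      :", site.getusersitepackages())
    # List every sys.path entry that a copy of sys.prefix alone would miss.
    for entry in sys.path:
        if entry and lives_outside_prefix(entry, sys.prefix):
            print("not under prefix:", entry)
```

For a single package, passing its `__file__` attribute to the helper shows directly whether it will travel with the prefix.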

Python environment on Ubuntu

 
sudo apt-get install libffi-dev
sudo apt-get install liblzma-dev


mkdir pyuban
cd soft 
wget https://www.sqlite.org/snapshot/sqlite-snapshot-202404051413.tar.gz --no-check-certificate

Many programs depend on sqlite, and it must be installed before Python
tar -xvf sqlite-snapshot-202404051413.tar.gz
cd sqlite-snapshot-202404051413
./configure -prefix=/data/jupyter/sqliteu
make
make install

sudo apt-get install libsqlite3-dev


wget https://www.openssl.org/source/openssl-3.3.2.tar.gz --no-check-certificate
tar -xvf openssl-3.3.2.tar.gz
cd openssl-3.3.2/
mkdir /data/jupyter/ssl3/
./config shared --prefix=/data/jupyter/ssl3/
sudo make 
sudo make install 

export PYTHON_SSL_DEFAULT=/data/jupyter/ssl3
export SSL_DIR=/data/jupyter/ssl3
export LD_LIBRARY_PATH=/data/jupyter/ssl/lib:$LD_LIBRARY_PATH


wget https://www.python.org/ftp/python/3.11.9/Python-3.11.9.tar.xz
tar -xvf Python-3.11.9.tar.xz
cd Python-3.11.9/
./configure --prefix=/data/jupyter/pyuban/ --enable-optimizations

make 
sudo make install 

sudo chown -R xt.xt /data 
cd /data/jupyter/pyuban/bin
ln -s python3 ./python

$ which python
/data/jupyter/pyuban/bin/python

ln pip3 ./pip

$ which pip
/data/jupyter/pyuban/bin/pip

Once more: always confirm where pip and python resolve to.
If the commands resolve to some other location, both installing and running packages will get mixed up
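
The same check can be scripted so it is easy to repeat after every change to PATH. This is a minimal sketch under assumptions: the helper names `is_under` and `check_commands` are hypothetical, and the prefix /data/jupyter/pyuban is the one used in this section.

```python
import os
import shutil


def is_under(path, prefix):
    """Return True if path, after resolving symlinks, is inside prefix."""
    path = os.path.realpath(path)
    prefix = os.path.realpath(prefix)
    return os.path.commonpath([path, prefix]) == prefix


def check_commands(prefix, commands=("python", "pip")):
    """Map each command to (resolved path or None, whether it is under prefix)."""
    result = {}
    for cmd in commands:
        found = shutil.which(cmd)
        result[cmd] = (found, bool(found) and is_under(found, prefix))
    return result


if __name__ == "__main__":
    # /data/jupyter/pyuban is the install prefix used in this section.
    for cmd, (found, ok) in check_commands("/data/jupyter/pyuban").items():
        print(cmd, found, "OK" if ok else "WRONG LOCATION")
```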

 
chown -R xt.xt /data/
su - xt 

vim .bashrc 
export PYTHONPATH=/data/jupyter/pyuban/lib/python3.11/site-packages
export PATH=/data/jupyter/pyuban/bin:$PATH
. ./.bashrc
    
$ python
Python 3.11.9 (main, Apr 28 2024, 14:48:54) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> import ssl
>>> 

Reinstall

 
Specify the OpenSSL location, then recompile and install Python

export PYTHON_SSL_DEFAULT=/data/jupyter/ssl3
export SSL_DIR=/data/jupyter/ssl3
export LD_LIBRARY_PATH=/data/jupyter/ssl/lib:$LD_LIBRARY_PATH


apt-get install libsqlite3-dev
apt-get install libffi-dev
apt-get install liblzma-dev

wget https://www.python.org/ftp/python/3.11.9/Python-3.11.9.tar.xz
or install offline
cd /data/jupyter/soft 
tar -xvf Python-3.11.9.tar.xz
cd Python-3.11.9/
./configure --prefix=/data/jupyter/pyuban/ --enable-optimizations

make 
sudo make install 

sudo chown -R xt.xt /data 
cd /data/jupyter/pyuban/bin
ln -s python3 ./python

$ which python
/data/jupyter/pyuban/bin/python

ln pip3 ./pip

$ which pip
/data/jupyter/pyuban/bin/pip

 


 


 
$ find /usr/ -name libffi.so.*
/usr/lib/x86_64-linux-gnu/libffi.so.8
/usr/lib/x86_64-linux-gnu/libffi.so.8.1.4
xt@kl:/data/jupyter/soft$ cd Python-3.11.9
xt@kl:/data/jupyter/soft/Python-3.11.9$ pwd
/data/jupyter/soft/Python-3.11.9


ImportError: libffi.so.7: cannot open shared object file: No such file or directory

---- 4 import pandas as pd

pip install --upgrade pandas -i https://pypi.tuna.tsinghua.edu.cn/simple

Reinstalling pandas did not help, so Python has to be recompiled

 
mkdir pyffi8
This is also a chance to try the newer 3.12, because compiling Python 3.11 on Ubuntu 20.04.6 produced a large number of warnings
wget https://www.python.org/ftp/python/3.12.6/Python-3.12.6.tar.xz
tar -xvf Python-3.12.6.tar.xz
cd /data/jupyter/soft/Python-3.12.6
./configure --prefix=/data/jupyter/pyffi8 --enable-optimizations
make 
sudo make install 

xt@kl:/data/jupyter/pyffi8/bin$ ln -s ./python3.12 ./python
xt@kl:/data/jupyter/pyffi8/bin$ ln -s ./pip3 ./pip

 
export PYTHONPATH=/data/jupyter/pyuban/lib/python3.11/site-packages
export PATH=/data/jupyter/pyffi8/bin:$PATH

PYTHONPATH still points at the previous installation, to temporarily reuse the packages downloaded earlier
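
Reusing a site-packages directory across minor versions is fragile: pure-Python packages usually import fine, but compiled extension modules are built for one specific minor version and will generally fail to load under another. The sketch below flags such entries; `mismatched_entries` is a hypothetical helper, and it only inspects the `pythonX.Y` component of each path.

```python
import os
import re
import sys


def mismatched_entries(pythonpath, version=None):
    """Return PYTHONPATH entries whose pythonX.Y component differs from version."""
    if version is None:
        version = sys.version_info[:2]
    bad = []
    for entry in pythonpath.split(os.pathsep):
        match = re.search(r"python(\d+)\.(\d+)", entry)
        if match and (int(match.group(1)), int(match.group(2))) != tuple(version):
            bad.append(entry)
    return bad


if __name__ == "__main__":
    for entry in mismatched_entries(os.environ.get("PYTHONPATH", "")):
        print("built for another Python version:", entry)
```

With the settings above, running this under the new 3.12 interpreter would report the python3.11 directory, which is a reminder to reinstall the packages for 3.12 once the temporary setup is no longer needed.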

 

  

 

  

 

  

 


Python installation as a non-root user

 
cat centos-7-x86_64.tar.gz |docker import - cent7
    
For installation
docker run -itd --name tt1 -h tt1 --net host -v /opt:/opt/  -v /tmp:/tmp -v /mnt:/mnt cent7 bash
docker exec -it tt1 bash

For testing
docker run -itd --name tt2 -h tt2 --net host -v /opt:/opt/  -v /tmp:/tmp -v /mnt:/mnt cent7 bash
docker exec -it tt2 bash

Building and installing python3

 
cd app/python/python_install/root_user_install_python_compile_env
rpm   -ivh  *.rpm --nodeps --force   
gcc -dumpversion
4.8.5



adduser ai-aml
su - ai-aml 
cd
mkdir -p app/python/python_project/
mkdir -p app/python/env_python37
cd app/python/python_install/amlai_proc
tar -zxvf  Python-3.7.0.tgz

cd   Python-3.7.0
./configure --prefix=/home/ai-aml/app/python/env_python37
make && make install

    

Environment variable configuration

 

cd 
mkdir /home/ai-aml/app/python/bin
ln -s /home/ai-aml/app/python/env_python37/bin/python3.7 /home/ai-aml/app/python/bin/python
    
vim .bash_profile
export PATH=/home/ai-aml/app/python/bin:/home/ai-aml/app/python/env_python37/bin:$PATH

. ./.bash_profile

$ which python
~/app/python/bin/python
$ python
Python 3.7.0 (default, Sep 28 2024, 21:44:13)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

python -m pip install --upgrade pip
pip3 install pip-23.1.2-py3-none-any.whl

$ which pip
~/app/python/env_python37/bin/pip
$ which pip3
~/app/python/env_python37/bin/pip3

 
cd app/python/python_install/amlai_proc
pip3 install -r requirements.txt --no-index -f packages
    
pip3 freeze |wc -l
82




    

 

    

Installation steps

 
Upload ml_20240929.tar.gz to the home directory of the regular user

Extract it in the home directory
tar -xvf ml_20240929.tar.gz


Install the system dependency packages (run as root)
cd ~/app/python/python_install
./ist_rpm.sh 


Install python
cd ~/app/python/python_install/
./ist_python.sh 
source ~/.bash_profile


Verify
python test_req.py 
Execution succeeded!
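The contents of test_req.py are not shown in these notes. As a hedged illustration only, a verification script of this kind could simply try to import each required module and report the failures; the module list below is an assumption:

```python
import importlib

def check_imports(module_names):
    """Try to import each module; return {name: error message} for the failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # ImportError, or errors from broken shared libraries
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures

if __name__ == "__main__":
    # Assumed module list; replace it with the packages the project needs.
    failed = check_imports(["pandas", "sklearn", "matplotlib"])
    for name, err in failed.items():
        print(f"{name}: {err}")
    print("Execution succeeded!" if not failed else "Execution failed!")
```

Importing, rather than only checking that the files exist, also catches missing system libraries such as the libffi and libssl cases described in these notes.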
    

 
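The archive contains only the commonly used machine learning packages
The files are already compiled and do not depend on a fixed directory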
    

 

    

 

    

 

    

 


 

  

 


miniforge installation

Maintained by the conda-forge organization, open source, free

 
https://github.com/conda-forge/miniforge

https://github.com/conda-forge/miniforge/tags

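forge  UK /fɔːdʒ/  US /fɔːrdʒ/
v. to shape metal by heating and hammering; to counterfeit; to make; to fake; to move ahead steadily; to strengthen through effort; to achieve through hard work
n. a smithy; a forging workshop; a forging plant; a forge furnace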

 
docker run -itd --privileged --name fg -h fg --net=host -v /tmp:/tmp -v /media:/media -v /media/xt/tpf/tpf:/opt/tpf  cent7  bash

alias fg="docker exec -it fg bash"

yum install -y gcc gcc-c++ kernel-devel

 
mkdir -p /data/jupyter/soft 
cd /data/jupyter/soft 
rsync -rltDv /media/xt/tpf/soft/Mambaforge-23.11.0-0-Linux-x86_64.sh ./
adduser jupyter 
chown -R jupyter.jupyter /data/jupyter/
su - jupyter
cd /data/jupyter/soft
[jupyter@fg soft]$ sh Mambaforge-23.11.0-0-Linux-x86_64.sh 
[/home/jupyter/mambaforge] >>> /data/jupyter/python

cd /data/jupyter/python/bin/
./conda init

[jupyter@fg bin]$ exit
logout
[root@fg /]# su - jupyter



 


 

  

 


jupyter

After a new system command is added, Jupyter must be restarted before the command can be found from within Jupyter

 
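Take java as an example:
java was not in the environment when Jupyter started,
and java was installed afterwards,
so Jupyter must be restarted before it can find the java command.

Restarting only the Jupyter kernel,
or running source /etc/profile or . ~/.bashrc inside Jupyter, has no effect.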
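The reason is that a kernel inherits its environment from the Jupyter server process at launch, while a `source` command run from a notebook cell only changes a short-lived subshell. When a full restart is inconvenient, one possible workaround is to extend PATH inside the running kernel. A minimal sketch; `add_to_path` is a hypothetical helper and `/opt/java/bin` is an assumed install location:

```python
import os
import shutil

def add_to_path(directory, env=os.environ):
    """Put `directory` at the front of PATH in `env` unless it is already listed."""
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        env["PATH"] = os.pathsep.join([directory] + parts)
    return env["PATH"]

add_to_path("/opt/java/bin")   # assumed location of the newly installed command
print(shutil.which("java"))    # path to java if it is now reachable, otherwise None
```

This changes only the current kernel process and the commands it starts afterwards; other kernels and the server itself still need the restart.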
    

Jupyter remote access configuration


Set the Python environment variable to tell Jupyter where the Python library directory is
export PYTHONPATH=/ai/app/anaconda3/lib/python3.9/site-packages

jupyter notebook password # set the Jupyter password
jupyter notebook --generate-config # generate a config file at ~/.jupyter/jupyter_notebook_config.py

Example
jupyter notebook password
book_1234

$ jupyter notebook --generate-config
vim ~/.jupyter/jupyter_notebook_config.py
c.NotebookApp.ip='*'
c.NotebookApp.open_browser=False
c.NotebookApp.port=8888
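On installs built on Jupyter Server (the traceback in the SSL section of these notes shows notebook/app.py importing jupyter_server), the same three settings are also available on `c.ServerApp`. A configuration sketch, assuming the file is `~/.jupyter/jupyter_server_config.py`; it is loaded by Jupyter and is not meant to be run on its own:

```python
# ~/.jupyter/jupyter_server_config.py (assumed location)
c = get_config()  # noqa: get_config is provided by Jupyter when it loads this file

c.ServerApp.ip = "0.0.0.0"        # listen on all IPv4 interfaces
c.ServerApp.open_browser = False  # do not open a browser on the server machine
c.ServerApp.port = 8888
```

Listening on all interfaces exposes the server to the network, so the password set above matters.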


 

    

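Configuring the Jupyter extension in VS Code

 
Jupyter usually runs as a web service, but under WSL it cannot simply be started as a web service

Instead, it can be installed into VS Code as an extension

Creating a file with the .ipynb suffix is enough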

SSL problem

 

xt@ai:/opt/tpf/aiwks/code$ jupyter notebook
Traceback (most recent call last):
    File "/data/jupyter/pyuban/bin/jupyter-notebook", line 5, in <module>
    from notebook.app import main
    File "/data/jupyter/pyuban/lib/python3.11/site-packages/notebook/app.py", line 12, in <module>
    from jupyter_server.base.handlers import JupyterHandler
    File "/data/jupyter/pyuban/lib/python3.11/site-packages/jupyter_server/base/handlers.py", line 21, in <module>
    import prometheus_client
    File "/data/jupyter/pyuban/lib/python3.11/site-packages/prometheus_client/__init__.py", line 3, in <module>
    from . import (
    File "/data/jupyter/pyuban/lib/python3.11/site-packages/prometheus_client/exposition.py", line 8, in <module>
    import ssl
    File "/data/jupyter/pyuban/lib/python3.11/ssl.py", line 100, in <module>
    import _ssl             # if we can't import it, let the error propagate
    ^^^^^^^^^^^
ImportError: libssl.so.1.1: cannot open shared object file: No such file or directory


ImportError: libssl.so.1.1: cannot open shared object file: No such file or directory

 
xt@ai:~$ cd /data/jupyter/
xt@ai:/data/jupyter$ ls
anaconda3  lib  python  pyuban  soft  sqlite3  sqliteu  ssl  ssl3  wks
xt@ai:/data/jupyter$ cd ssl
xt@ai:/data/jupyter/ssl$ ls
bin  include  lib  share  ssl
xt@ai:/data/jupyter/ssl$ cd lib/
xt@ai:/data/jupyter/ssl/lib$ ls
engines-1.1  libcrypto.a  libcrypto.so  libcrypto.so.1.1  libssl.a  libssl.so  libssl.so.1.1  pkgconfig
xt@ai:/data/jupyter/ssl/lib$ cd ../..
xt@ai:/data/jupyter$ cd ssl3/
xt@ai:/data/jupyter/ssl3$ ls
bin  include  lib64  share  ssl
xt@ai:/data/jupyter/ssl3$ ls lib64/
cmake  engines-3  libcrypto.a  libcrypto.so  libcrypto.so.3  libssl.a  libssl.so  libssl.so.3  ossl-modules  pkgconfig


 
The missing libssl.so.1.1 is in /data/jupyter/ssl/lib

Add that directory to the library search path

export LD_LIBRARY_PATH=/data/jupyter/ssl/lib:$LD_LIBRARY_PATH
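With several OpenSSL builds side by side, the directory that actually holds the missing file can be located with a few lines instead of walking the tree by hand. A minimal sketch; `dirs_containing` is a hypothetical helper and the candidate directories are the two shown in the listing above:

```python
from pathlib import Path

def dirs_containing(soname, candidate_dirs):
    """Return the candidate directories that contain a file named `soname`."""
    return [str(d) for d in candidate_dirs if (Path(d) / soname).is_file()]

candidates = ["/data/jupyter/ssl/lib", "/data/jupyter/ssl3/lib64"]
print(dirs_containing("libssl.so.1.1", candidates))
```

The directory it reports is the one to put on LD_LIBRARY_PATH before starting Jupyter.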



The module 'jupytext' could not be found

 
File "/data/jupyter/pyuban/lib/python3.11/site-packages/jupyter_server/extension/manager.py", line 202, in _load_metadata
    raise ExtensionModuleNotFound(msg) from None
jupyter_server.extension.utils.ExtensionModuleNotFound: The module 'jupytext' could not be found (No module named 'jupytext'). Are you sure the extension is installed?
[W 2024-09-09 06:01:21.068 ServerApp] jupyter_tensorboard | error adding extension (enabled: True): The module 'jupyter_tensorboard' could not be found (No module named 'jupyter_tensorboard'). Are you sure the extension is installed?


pip install jupytext

 


 


Adding custom Python module paths in Jupyter

 
xt@kl:~/.jupyter$ vim jupyter_notebook_config.py

c = get_config()  #noqa

import os
c.NotebookApp.env = {
    'PYTHONPATH': os.pathsep.join([
        '/opt/wks/aitpf/src',  # Linux or macOS
        os.getenv('PYTHONPATH', '')
    ])
}
    

Stray whitespace such as blank lines can sometimes cause problems, so the setting can be written on one line to reduce whitespace

 
import os
c.NotebookApp.env={'PYTHONPATH': os.pathsep.join(['/opt/wks/aitpf/src','/opt/wks/aitpf/fanxq',os.getenv('PYTHONPATH', '')])}
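When PYTHONPATH is not set, joining with `os.getenv('PYTHONPATH', '')` leaves a dangling separator at the end of the value. A small sketch of a helper that skips empty and repeated entries; `build_pythonpath` is a hypothetical name, and the directories are the ones used above:

```python
import os

def build_pythonpath(extra_dirs, existing=None):
    """Join extra_dirs with the existing PYTHONPATH, dropping empty and repeated entries."""
    if existing is None:
        existing = os.getenv("PYTHONPATH", "")
    result = []
    for part in list(extra_dirs) + existing.split(os.pathsep):
        if part and part not in result:
            result.append(part)
    return os.pathsep.join(result)

# The returned string can be used as the PYTHONPATH value in the config above.
print(build_pythonpath(["/opt/wks/aitpf/src", "/opt/wks/aitpf/fanxq"]))
```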

    

 

    

 

    
Installing dependency packages

 
pip install -r requirements.txt
If a package fails to install, remove it from requirements.txt and rerun the batch install, then install the failed package separately
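The remove-and-retry loop can also be automated by installing the requirements one at a time and collecting the ones that fail. A minimal sketch, assuming a simple requirements.txt with one requirement per line (option lines such as `-r` or `-i` are skipped); the function names are hypothetical:

```python
import re
import subprocess
import sys

def parse_requirements(text):
    """Return requirement strings, skipping comments, blank lines, and option lines."""
    reqs = []
    for line in text.splitlines():
        line = re.sub(r"(^|\s)#.*$", "", line).strip()
        if line and not line.startswith("-"):
            reqs.append(line)
    return reqs

def install_each(reqs, runner=None):
    """Install requirements one by one; return the ones whose install failed."""
    if runner is None:
        def runner(req):
            # Uses the pip that belongs to the interpreter running this script.
            return subprocess.call([sys.executable, "-m", "pip", "install", req])
    return [req for req in reqs if runner(req) != 0]

if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as fh:
        failed = install_each(parse_requirements(fh.read()))
    print("failed:", failed)
```

The packages printed at the end are the ones to install separately, for example with a different version or after installing a missing OS package.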



ai_yilaipgkinstall
chromadb
pysqlite3
    


 
pip install -U sentence-transformers


 

    

 


 

  

 


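Machine learning

Machine learning toolkits

 
Installing pandas/scikit-learn automatically installs numpy and scipy

If the network is slow, specify a mirror inside China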
-i https://pypi.tuna.tsinghua.edu.cn/simple

pip install pandas scikit-learn hmmlearn  sklearn_crfsuite chinese_calendar matplotlib  pydotplus  openpyxl  pdfminer.six -i https://pypi.tuna.tsinghua.edu.cn/simple 

pip install pandas
pip install scikit-learn
pip install hmmlearn
pip install sklearn_crfsuite
pip install chinese_calendar
pip install matplotlib
pip install pydotplus
pip install seaborn -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install jieba
pip install preprocess
pip install openpyxl   # lets pandas read Excel files
pip install pdfminer.six

pip install mlxtend -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install shap seaborn -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pyod -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install copulas 

pip install lightgbm 
pip install lightgbm -i https://pypi.tuna.tsinghua.edu.cn/simple 

pip install xgboost
pip install xgboost -i https://pypi.tuna.tsinghua.edu.cn/simple 
pip install xgboost -i https://pypi.douban.com/simple --trusted-host pypi.douban.com

pip install catboost
pip install catboost -i https://pypi.tuna.tsinghua.edu.cn/simple # faster download via the mirror
pip install catboost -i https://pypi.douban.com/simple --trusted-host pypi.douban.com

pip install sklearn-pandas
pip install --user --upgrade git+https://github.com/jpmml/sklearn2pmml.git
or
pip install sklearn2pmml -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install featuretools -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install feature-engine -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install tsfresh -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install requests
pip install Flask
pip install gevent 
pip install Flask-APScheduler

pip install cx_Oracle 
pip install pymysql==1.0.2
pip install sqlalchemy

pip install python-dotenv

pip install jupyter 
pip install jupyter_contrib_nbextensions -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install --user  -i https://pypi.tuna.tsinghua.edu.cn/simple py2neo

pip3 install cryptography -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install backports.lzma -i https://pypi.tuna.tsinghua.edu.cn/simple

https://zhuanlan.zhihu.com/p/366952043

 
pip install statsmodels
https://blog.csdn.net/GodFatherMisZhao/article/details/136339482


 
pip install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple  # pip needs a Python built with SSL support

pip install jupyter jupyter_contrib_nbextensions -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install pandas sklearn-pandas scikit-learn hmmlearn  sklearn_crfsuite chinese_calendar matplotlib  pydotplus  openpyxl  pdfminer.six -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install lightgbm catboost xgboost statsmodels -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install featuretools feature-engine tsfresh mlxtend shap seaborn pyod copulas cx_Oracle pymysql==1.0.2 sqlalchemy -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install torch torchvision torchaudio onnx onnxruntime -i https://pypi.tuna.tsinghua.edu.cn/simple

    

 

    

Common mirrors

 
Tsinghua University https://pypi.tuna.tsinghua.edu.cn/simple/
Alibaba Cloud http://mirrors.aliyun.com/pypi/simple/
University of Science and Technology of China (USTC) https://pypi.mirrors.ustc.edu.cn/simple/
Douban http://pypi.douban.com/simple/

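Machine learning toolkits (every command with the mirror specified)

 
Installing pandas/scikit-learn automatically installs numpy and scipy

If the network is slow, specify a mirror inside China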
-i https://pypi.tuna.tsinghua.edu.cn/simple

pip install pandas -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install scikit-learn -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install hmmlearn -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install sklearn_crfsuite -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install chinese_calendar -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install matplotlib -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pydotplus -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install jieba -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install preprocess -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install openpyxl  -i https://pypi.tuna.tsinghua.edu.cn/simple # lets pandas read Excel files
pip install pdfminer.six -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install lightgbm 
pip install lightgbm -i https://pypi.tuna.tsinghua.edu.cn/simple 

pip install xgboost
pip install xgboost -i https://pypi.tuna.tsinghua.edu.cn/simple 
pip install xgboost -i https://pypi.douban.com/simple --trusted-host pypi.douban.com

pip install catboost
pip install catboost -i https://pypi.tuna.tsinghua.edu.cn/simple # faster download via the mirror
pip install catboost -i https://pypi.douban.com/simple --trusted-host pypi.douban.com

pip install sklearn-pandas -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install --user --upgrade git+https://github.com/jpmml/sklearn2pmml.git

pip install requests -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install Flask -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install gevent  -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install Flask-APScheduler -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install cx_Oracle  -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pymysql==1.0.2 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install sqlalchemy -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install jupyter -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install --user  -i https://pypi.tuna.tsinghua.edu.cn/simple py2neo

pip3 install cryptography -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install jupyter_contrib_nbextensions -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install jupyter_nbextensions_configurator -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install backports.lzma -i https://pypi.tuna.tsinghua.edu.cn/simple

https://zhuanlan.zhihu.com/p/366952043

 
pip install statsmodels -i https://pypi.tuna.tsinghua.edu.cn/simple
https://blog.csdn.net/GodFatherMisZhao/article/details/136339482


pytorch cpu

pip install

 
pip install torch torchvision torchaudio
    
pip install onnx onnxruntime

conda install


conda install pytorch torchvision torchaudio cpuonly -c pytorch

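The torch family of packages

 
pytorch and tensorflow are large frameworks, and such frameworks are usually not installed in the same environment.
If pytorch is installed in an environment, tensorflow is usually left out of it
and installed in a separate, new environment,
because putting them together can cause conflicts.
That is what "a pytorch environment" means; it does not mean the environment contains only pytorch.
Likewise, if an interviewer asks which frameworks you commonly use and you answer pytorch, that does not mean you have never used sklearn.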

https://pytorch.org/get-started/locally/
conda install pytorch torchvision torchaudio cpuonly -c pytorch
or
pip install torch torchvision torchaudio
>>> import torch
>>> torch.__version__
'1.13.1+cpu'

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.13.1+cpu.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-1.13.1+cpu.html
pip install torch-geometric
pip install torch-cluster -f https://data.pyg.org/whl/torch-1.13.1+cpu.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-1.13.1+cpu.html

>>> import torch
>>> torch.__version__
'1.13.1+cu117'
2.2.2+cu121

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.13.1+cu117.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-1.13.1+cu117.html
pip install torch-geometric
pip install torch-cluster -f https://data.pyg.org/whl/torch-1.13.1+cu117.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-1.13.1+cu117.html

>>> import torch
>>> torch.__version__
'2.0.0'

pip install torch-scatter -f https://data.pyg.org/whl/torch-2.0.0+cpu.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.0.0+cpu.html
pip install torch-geometric
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.0.0+cpu.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-2.0.0+cpu.html

2.2.2+cu121
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-geometric
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.2.2+cu121.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-2.2.2+cu121.html

>>> import torch
>>> torch.__version__
'2.5.1+cu124'

pip install torch-scatter -f https://data.pyg.org/whl/torch-2.5.1+cu124.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.5.1+cu124.html
pip install torch-geometric
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.5.1+cu124.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-2.5.1+cu124.html


>>> import torch
>>> torch.__version__
'2.6.0+cu124'

For torch 2.6.0 the installs below did not succeed, but they do succeed with torch 2.5.1

pip install torch-scatter -f https://data.pyg.org/whl/torch-2.6.0+cu124.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.6.0+cu124.html
pip install torch-geometric
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.6.0+cu124.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-2.6.0+cu124.html
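Every block above follows one pattern: the wheel index URL embeds the exact string reported by `torch.__version__`. A small sketch that builds the commands from that string; the function names are hypothetical, and appending `+cpu` when the version has no suffix is an assumption taken from the '2.0.0' case above:

```python
def pyg_wheel_index(torch_version):
    """Build the data.pyg.org wheel index URL for a torch version string."""
    if "+" not in torch_version:
        torch_version += "+cpu"  # assumption: a bare version means a CPU build
    return f"https://data.pyg.org/whl/torch-{torch_version}.html"

def pyg_install_commands(torch_version):
    """Return the pip commands for the companion packages, in the order used above."""
    url = pyg_wheel_index(torch_version)
    packages = ["torch-scatter", "torch-sparse", "torch-cluster", "torch-spline-conv"]
    return [f"pip install {pkg} -f {url}" for pkg in packages]

for cmd in pyg_install_commands("2.5.1+cu124"):
    print(cmd)
```

Whether wheels exist for a given version still has to be checked on the index page itself, as the 2.6.0 case shows.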



Deep learning deployment

 
pip install onnx 
pip install onnxruntime

Common conda commands

 
conda create --name py39 python=3.9
conda create --name py36 python=3.6

conda activate py36
conda deactivate
conda remove -n py36 --all
Problems installing AI dependency packages
    
import pandas 
ModuleNotFoundError: No module named '_bz2'
-------------------------------------------------------------
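This problem looks simple, but at first nothing seems to fix it...

_bz2 is the module behind bz2 compression, and it depends on a system library.
The root cause: if the system lacks the bzip2 development package, pandas still installs normally without any error,
and then import pandas fails with:
File "/opt/app/python3/lib/python3.9/site-packages/pandas/io/common.py", line 8, in <module>
    import bz2
    File "/opt/app/python3/lib/python3.9/bz2.py", line 18, in <module>
    from _bz2 import BZ2Compressor, BZ2Decompressor
ModuleNotFoundError: No module named '_bz2'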

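This is really a Python build problem that arises when Python is compiled.
If a dependency turns out to be missing after Python is installed, install the missing package on the OS and then recompile Python.
Some people copy library files compiled elsewhere, but that does not always work.
pandas has no control over how Python itself was compiled; recompiling Python fixes it.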

Solution: first install the package on the OS
yum install bzip2-devel     # CentOS/RHEL
apt-get install libbz2-dev  # Ubuntu

Back up the original python directory; the main point is to keep the packages already installed in it
cp -r python3/ python39

Then recompile python
tar -xvf Python-3.9.15.tar.xz 
cd Python-3.9.15/
vim setup.py # optional: add the path of the pre-built sqlite library here; not required, but recommended
./configure --prefix=/opt/app/python3
make 
sudo make install

After this build and install, a new _bz2 .so file is present
$ pwd
/opt/app/python3/lib/python3.9/lib-dynload
$ ll _bz*
-rwxr-xr-x 1 root root 63680 Jan 30 15:32 _bz2.cpython-39-x86_64-linux-gnu.so*
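Rather than discovering such gaps one import error at a time, the freshly built interpreter can be asked right after `make install` which optional C modules it lacks. A minimal sketch; the module list is an assumption covering the libraries mentioned in these notes, and `missing_modules` is a hypothetical name:

```python
import importlib.util

# Assumed list: extension modules that are silently skipped when the matching
# OS development package is absent at compile time.
OPTIONAL_C_MODULES = ["_bz2", "_lzma", "_ssl", "_sqlite3", "_ctypes", "zlib"]

def missing_modules(names=OPTIONAL_C_MODULES):
    """Return the module names that the running interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]

print(missing_modules())  # an empty list means every listed module was built
```

Run it with the new interpreter, for example /opt/app/python3/bin/python3; any name it prints points at an OS package to install before compiling again.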

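Then copy the original packages back.
Is it safe to simply copy already-installed packages like this?
I have been doing so recently, because production has no internet access,
and so far I have had no problems. If any appear, I will record in this column which packages break when copied.
So far the AI packages all seem fine.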
rsync -avP python39/lib/python3.9/site-packages/* python3/lib/python3.9/site-packages/

xt@xt:/opt/app$ python
Python 3.9.15 (main, Jan 30 2023, 15:31:27) 
[GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> 
Calling Java from Python

Installing the package for calling Java from Python

 
https://pypi.org/project/JPype1/#files
https://files.pythonhosted.org/packages/57/4f/3cddc9b9cd892bbe098e5d48ed3a8aaa02dd3fa732612065fa6b0fab0062/JPype1-1.3.0.tar.gz

tar -xvf JPype1-1.3.0.tar.gz
cd JPype1-1.3.0
python setup.py install

GPU

 
C:\Users\83933>nvidia-smi
    Sun Apr 28 10:22:24 2024
    +---------------------------------------------------------------------------------------+
    | NVIDIA-SMI 537.53                 Driver Version: 537.53       CUDA Version: 12.2     |
    |-----------------------------------------+----------------------+----------------------+
    | GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
    |                                         |                      |               MIG M. |
    |=========================================+======================+======================|
    |   0  NVIDIA GeForce RTX 4070 ...  WDDM  | 00000000:01:00.0 Off |                  N/A |
    | N/A    0C    P0              N/A /  80W |      0MiB /  8188MiB |      0%      Default |
    |                                         |                      |                  N/A |
    +-----------------------------------------+----------------------+----------------------+
    
    +---------------------------------------------------------------------------------------+
    | Processes:                                                                            |
    |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
    |        ID   ID                                                             Usage      |
    |=======================================================================================|
    |  No running processes found                                                           |
    +---------------------------------------------------------------------------------------+

    

pytorch.org

 
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

CUDA Version: 12.2
pytorch.org offers no build for 12.2, so install cu121 instead; it works as well
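The "CUDA Version" printed by nvidia-smi is the newest CUDA runtime the installed driver supports, so a wheel built for the same or an older CUDA version is the safe choice. A conservative rule-of-thumb sketch, assuming wheel tags of the form `cu` + major + one-digit minor (cu117, cu121, cu124); the helper names are hypothetical:

```python
import re

def parse_cu_tag(tag):
    """Turn a wheel tag such as 'cu121' into (12, 1)."""
    m = re.fullmatch(r"cu(\d+)(\d)", tag)
    if m is None:
        raise ValueError(f"unrecognized CUDA tag: {tag!r}")
    return int(m.group(1)), int(m.group(2))

def wheel_fits_driver(tag, driver_cuda):
    """True if the wheel's CUDA version is not newer than the driver's CUDA version."""
    major, minor = (int(x) for x in driver_cuda.split("."))
    return parse_cu_tag(tag) <= (major, minor)

print(wheel_fits_driver("cu121", "12.2"))  # True
print(wheel_fits_driver("cu124", "12.2"))  # False under this conservative rule
```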
    

 
>>> import torch
>>>
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name()
'NVIDIA GeForce RTX 4070 Laptop GPU'
>>> torch.cuda.get_device_name(0)
'NVIDIA GeForce RTX 4070 Laptop GPU'
>>>
  

The data and the model must be on the same device

 
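import torch

# Use the GPU when one is available, otherwise fall back to the CPU
USE_CUDA = torch.cuda.is_available()
device = torch.device("cuda" if USE_CUDA else "cpu")

# encoder, decoder, loss, the tensors, and SOS_token below come from the training code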

encoder = encoder.to(device)
decoder = decoder.to(device)

loss = loss.to(device)
input_variable = input_variable.to(device)
target_variable = target_variable.to(device)
mask = mask.to(device)

decoder_input = torch.ones(1, 1, device=device, dtype=torch.long) * SOS_token
# Initialize tensors to append decoded words to
all_tokens = torch.zeros([0], device=device, dtype=torch.long)
all_scores = torch.zeros([0], device=device)
    

A few words about GPUs

 
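When buying a computer or server:
GPUs come from several vendors, but not every brand can be used for deep learning training.
To find out whether the NVIDIA driver can be installed, check the NVIDIA website:
NVIDIA Driver Downloads

The benefits of a GPU need no explanation, so here are the drawbacks:
1. Price. If money is no object, skip this point and the rest too: you get what you pay for, so just pick the expensive one.
2. Noise. Under load the fan noise is more than most people can tolerate; in the office it draws attention, and at home it disturbs rest.
3. Performance. Deep learning needs at least a desktop, ideally water-cooled, which a laptop cannot match.
4. Duration. One training run can take weeks or even a month or two; if it runs on a laptop, the laptop is unavailable for anything else.
5. How often will it be used? Buying feels great, but few people use the hardware regularly afterwards.
6. I have been working with AI for several years and still use my old computer; when I need acceleration I rent a GPU online. I was tempted, but held off buying, mainly for lack of money.

Based on the points above, beginners are advised to rent a GPU first; in other words, buying a GPU is not recommended during the first two years.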
