Downloads for the different versions
https://www.elastic.co/guide/en/elasticsearch/reference/6.5/es-release-notes.html

Reference followed for this install: http://blog.51cto.com/moerjinrong/2310817

The IK analyzer plugin version must match the ES version exactly, so check both version numbers before downloading.
This install uses v6.5.0 for es and ik, plus the head plugin:
https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.0.tar.gz
https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.5.0/elasticsearch-analysis-ik-6.5.0.zip
https://github.com/mobz/elasticsearch-head/archive/v5.0.0.tar.gz

Official documentation
https://www.elastic.co/guide/index.html
https://www.elastic.co/guide/en/elasticsearch/reference/6.5/release-notes-6.5.0.html

ES7
Downloads: https://elastic.co/downloads/elasticsearch
Release notes: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/release-notes-7.17.20.html
Past releases:
https://www.elastic.co/downloads/past-releases#elasticsearch
https://www.elastic.co/downloads/past-releases/elasticsearch-7-17-20
https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.20-linux-x86_64.tar.gz

Download (a JDK is required)
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.0.tar.gz --no-check-certificate
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.5.0/elasticsearch-analysis-ik-6.5.0.zip --no-check-certificate
wget https://github.com/mobz/elasticsearch-head/archive/v5.0.0.tar.gz --no-check-certificate
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.20-linux-x86_64.tar.gz --no-check-certificate
Three-node cluster installation
- ES runs as a cluster; at least two nodes are required
- SSH does not need to be configured between the nodes
- A JDK is required

Create a dedicated Docker subnet, used only inside this server:
docker network rm mydk
docker network create --subnet=192.168.73.0/24 mydk

docker run -itd --privileged --name es1 -h es1 --net mydk --ip 192.168.73.11 -v /opt:/opt -v /tmp:/tmp -v /mnt:/mnt -v /media:/media -p 13301:13301 cent7 /usr/sbin/init
docker exec -it es1 bash

### Dependency installation
yum install -y net-tools libaio numactl
yum -y install gcc gcc-c++ autoconf make
yum install openssl-devel bzip2-devel

docker run -itd --privileged --name es2 -h es2 --net mydk --ip 192.168.73.12 -v /opt:/opt -v /tmp:/tmp -v /mnt:/mnt -v /media:/media -p 13301:13301 cent7 /usr/sbin/init
docker exec -it es2 bash

docker run -itd --privileged --name es3 -h es3 --net mydk --ip 192.168.73.13 -v /opt:/opt -v /tmp:/tmp -v /mnt:/mnt -v /media:/media cent7 /usr/sbin/init
docker exec -it es3 bash

JDK environment variables
export JAVA_HOME=/opt/app/jdk-11
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH
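To sanity-check the environment before installing ES, the network and the JDK can be verified with a couple of commands. A minimal sketch; container name es1, network mydk and the JAVA_HOME path are the values assumed from the commands above:

# Confirm the dedicated subnet exists and the containers received their fixed IPs
docker network inspect mydk | grep -E '"Name"|"IPv4Address"'

# Inside a container, confirm the JDK actually runs from the exported JAVA_HOME
docker exec -it es1 bash -c 'export JAVA_HOME=/opt/app/jdk-11; $JAVA_HOME/bin/java -version'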
Node 1
mkdir -p /data/es/{app,data,logs}
rsync -rltDv /media/xt/tpf/soft/es/ /data/es/app/
cd /data/es/app/
tar -zxvf elasticsearch-6.5.0.tar.gz

mkdir /data/es/app/elasticsearch-6.5.0/plugins/ik
unzip elasticsearch-analysis-ik-6.5.0.zip -d /data/es/app/elasticsearch-6.5.0/plugins/ik

ls /data/es/app/elasticsearch-6.5.0/plugins/ik
commons-codec-1.9.jar    config                               httpclient-4.5.2.jar  plugin-descriptor.properties
commons-logging-1.2.jar  elasticsearch-analysis-ik-6.5.0.jar  httpcore-4.4.4.jar    plugin-security.policy
echo " xt soft nofile 655350 xt hard nofile 655350 xt soft nproc 655350 xt hard nproc 655350 xt soft memlock -1 xt hard memlock -1 " >> /etc/security/limits.conf cat /etc/security/limits.conf ll /etc/security/limits.d/20-nproc.conf echo " xt soft nproc 655350 ">> /etc/security/limits.d/20-nproc.conf cat /etc/security/limits.d/20-nproc.conf echo " xt soft nproc 655350 ">> /etc/security/limits.d/90-nproc.conf cat /etc/security/limits.d/90-nproc.conf echo " vm.max_map_count=262144 ">> /etc/sysctl.conf sysctl -p |
For this Docker-based install, SSH communication between the containers failed while ES communication worked; the exact reason was not tracked down.

Run on every node:
yum install openssh-server
adduser xt
su - xt
ssh-keygen -t rsa

On each node, run the two ssh-copy-id commands for the other two nodes (skip its own address):
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.73.11
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.73.12
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.73.13

If all three nodes are installed on a single machine, no SSH trust needs to be configured at all.

Likely explanation: ES nodes communicate over their own HTTP/transport ports and never use the SSH service, so SSH trust is not required for the cluster to work.
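Once the nodes are running (see the start-up steps below), the connectivity that actually matters can be checked over the ES ports rather than SSH. A minimal sketch, assuming the node IPs above and the default ports 9200 (HTTP) and 9300 (transport):

# From any node, check that a peer answers on the HTTP port
curl -s http://192.168.73.12:9200

# And that the transport port used for inter-node traffic is reachable (bash /dev/tcp trick)
timeout 3 bash -c '</dev/tcp/192.168.73.12/9300' && echo "9300 reachable"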
Configure one node first, then copy the installation to the other nodes.

vim /data/es/app/elasticsearch-6.5.0/config/elasticsearch.yml
cluster.name: my-application
node.name: node-1
path.data: /data/es/data/
path.logs: /data/es/logs/
network.host: 192.168.73.11
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.73.11", "192.168.73.12","192.168.73.13"]
discovery.zen.minimum_master_nodes: 2

Copy the files to the other nodes:
mkdir -p /data/es/{app,data,logs}
scp -r xt@192.168.73.11:/data/es/app/elasticsearch-6.5.0 /data/es/app
chown -R xt.xt /data/es

On the other nodes, change elasticsearch.yml as follows:
vim /data/es/app/elasticsearch-6.5.0/config/elasticsearch.yml
node.name: node-2
network.host: 192.168.73.12

Since scp between the containers did not work here, the copy can also go through the shared /tmp mount with rsync:
rsync -rltDv /data/es/app/elasticsearch-6.5.0 /tmp/
mkdir -p /data/es/{app,data,logs}
chown -R xt.xt /data/es
rsync -rltDv /tmp/elasticsearch-6.5.0 /data/es/app/
chown -R xt.xt /data/es
http://192.168.73.11:9100 (elasticsearch-head UI)

Start in the background (as the xt user):
cd /data/es/app/elasticsearch-6.5.0
nohup ./bin/elasticsearch > /data/es/logs/start.log 2>&1 &
tailf /data/es/logs/start.log

or simply:
./bin/elasticsearch -d

The first node logs the following at startup; it stops once the second node comes up:
not enough master nodes discovered during

After the second node starts there is a message about joining the cluster; the third node does not log it, because minimum_master_nodes is set to 2 in this configuration:
[node-2] recovered [0] indices into cluster_state

Shutdown: kill the process as the same user that started it
ps -ef | grep ela
kill -9 <PID>
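After all nodes are started, the REST API can confirm the cluster really formed. A small sketch against the node-1 address configured above:

# Overall cluster health; the status should end up green once all shards are allocated
curl -s 'http://192.168.73.11:9200/_cluster/health?pretty'

# One line per node; the elected master is flagged with * in the master column
curl -s 'http://192.168.73.11:9200/_cat/nodes?v'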
Access http://192.168.73.11:9200/ in a browser.

Create an index
curl -XPUT http://192.168.73.11:9200/index
{"acknowledged":true,"shards_acknowledged":true,"index":"index"}

Create a mapping
curl -XPOST http://192.168.73.11:9200/index/fulltext/_mapping -H 'Content-Type:application/json' -d'
{
    "properties": {
        "content": {
            "type": "text",
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_max_word"
        }
    }
}'
{"acknowledged":true}

Index some documents
curl -XPOST http://192.168.73.11:9200/index/fulltext/1 -H 'Content-Type:application/json' -d'
{"content":"时间是一切财富中最宝贵的财富"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/2 -H 'Content-Type:application/json' -d'
{"content":"世界上一成不变的东西,只有“任何事物都是在不断变化的”这条真理。"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/3 -H 'Content-Type:application/json' -d'
{"content":"要使别人喜欢你,首先你得改变对人的态度,把精神放得轻松一点,表情自然,笑容可掬,这样别人就会对你产生喜爱的感觉了。——卡耐基"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/4 -H 'Content-Type:application/json' -d'
{"content":"君子在下位则多谤,在上位则多誉;小人在下位则多誉,在上位则多谤。——柳宗元"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/5 -H 'Content-Type:application/json' -d'
{"content":"一个不注意小事情的人,永远不会成功大事业。——卡耐基"}
'
{"_index":"index","_type":"fulltext","_id":"5","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":3}

Search with highlighting
curl -XPOST http://192.168.73.11:9200/index/fulltext/_search?pretty -H 'Content-Type:application/json' -d'
{
    "query" : { "match" : { "content" : "卡耐基" }},
    "highlight" : {
        "pre_tags" : ["<tag1>", "<tag2>"],
        "post_tags" : ["</tag1>", "</tag2>"],
        "fields" : {
            "content" : {}
        }
    }
}
'

Search result
{
  "took" : 307,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "fulltext",
        "_id" : "5",
        "_score" : 0.2876821,
        "_source" : {
          "content" : "一个不注意小事情的人,永远不会成功大事业。——卡耐基"
        },
        "highlight" : {
          "content" : [
            "——
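To see that the IK plugin is doing the segmentation (rather than the default standard analyzer), the _analyze API can be called against the index created above. A sketch reusing one of the sample sentences:

# Tokenize a sample sentence with ik_max_word and inspect the produced tokens
curl -s -XPOST 'http://192.168.73.11:9200/index/_analyze?pretty' -H 'Content-Type: application/json' -d'
{
    "analyzer": "ik_max_word",
    "text": "时间是一切财富中最宝贵的财富"
}'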
ES7 can be installed as a single node.
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.4-linux-x86_64.tar.gz --no-check-certificate
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.17.4/elasticsearch-analysis-ik-7.17.4.zip --no-check-certificate
wget https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/v7.17.4/elasticsearch-analysis-pinyin-7.17.4.zip
Further reading:
Install on the es1, es2, es3 containers set up earlier.
su - xt
cd /data/es/app
rsync -rltDv /tmp/es7/elasticsearch-7.17.4-linux-x86_64.tar.gz ./
tar -xvf elasticsearch-7.17.4-linux-x86_64.tar.gz

discovery.seed_hosts: list of cluster hosts used for discovery
cluster.initial_master_nodes: the nodes taking part in the very first master election; required in production

vim elasticsearch-7.17.4/config/elasticsearch.yml
cluster.name: my-application
node.name: node-1
path.data: /data/es/data/
path.logs: /data/es/logs
network.host: 192.168.73.11
http.port: 9200
discovery.seed_hosts: ["192.168.73.11", "192.168.73.12"]
cluster.initial_master_nodes: ["node-1", "node-2"]

./bin/elasticsearch -d

Reference: https://www.cnblogs.com/Likfees/p/16449224.html

Analyzer plugins
cd elasticsearch-7.17.4/plugins/
rsync -rltDv /tmp/es7/elasticsearch-analysis-ik-7.17.4.zip ./
unzip -d ik elasticsearch-analysis-ik-7.17.4.zip
rm elasticsearch-analysis-ik-7.17.4.zip

wget https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/v7.17.4/elasticsearch-analysis-pinyin-7.17.4.zip
rsync -rltDv /tmp/es7/elasticsearch-analysis-pinyin-7.17.4.zip ./
unzip -d pinyin elasticsearch-analysis-pinyin-7.17.4.zip
rm elasticsearch-analysis-pinyin-7.17.4.zip
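After unpacking the plugins it is worth checking that ES picks them up; manually unzipped plugins are only loaded after a restart. A small sketch, assuming the install path and node-1 address used above:

# List the plugins known to the local installation
cd /data/es/app/elasticsearch-7.17.4 && ./bin/elasticsearch-plugin list

# After a restart, every node should report ik and pinyin here
curl -s 'http://192.168.73.11:9200/_cat/plugins?v'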
Node 2: copy the configured installation through /tmp, then adjust the config.
rsync -rltDv /data/es/app/elasticsearch-7.17.4 /tmp/
rsync -rltDv /tmp/elasticsearch-7.17.4 /data/es/app/

vim elasticsearch-7.17.4/config/elasticsearch.yml
cluster.name: my-application
node.name: node-2
path.data: /data/es/data/
path.logs: /data/es/logs
network.host: 192.168.73.12
http.port: 9200
discovery.seed_hosts: ["192.168.73.11", "192.168.73.12"]
cluster.initial_master_nodes: ["node-1", "node-2"]

./bin/elasticsearch -d

[xt@es2 elasticsearch-7.17.4]$ netstat -tunlp
(Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 192.168.73.12:9300      0.0.0.0:*               LISTEN      213/java
tcp        0      0 192.168.73.12:9200      0.0.0.0:*               LISTEN      213/java
tcp        0      0 127.0.0.11:43867        0.0.0.0:*               LISTEN      -
udp        0      0 127.0.0.11:47013        0.0.0.0:*                           -
[xt@es2 elasticsearch-7.17.4]$

The configuration below was not used this time and the cluster still came up. discovery.zen.ping.unicast.hosts and discovery.zen.minimum_master_nodes are the ES6-era Zen Discovery settings; ES7 replaces them with discovery.seed_hosts and cluster.initial_master_nodes (minimum_master_nodes is ignored in 7.x), so they are not needed here:
cluster.name: kkb-es
node.name: node-0
node.master: true
network.host: 0.0.0.0
http.port: 9200
transport.tcp.port: 9300    # transport (TCP) port
discovery.zen.ping.unicast.hosts: ["192.168.147.66:9300","192.168.147.67:9300","192.168.147.68:9300"]
discovery.zen.minimum_master_nodes: 2
http.cors.enabled: true
http.cors.allow-origin: "*"
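To back up the note above about the zen settings being legacy, the cluster settings API can dump the effective discovery configuration of the running 7.x nodes. A sketch against node-2:

# Show the effective discovery-related settings (defaults included);
# on 7.x this lists discovery.seed_hosts rather than the old zen unicast list
curl -s 'http://192.168.73.12:9200/_cluster/settings?include_defaults=true&flat_settings=true&pretty' | grep -i discovery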
Later installs can simply extract the pre-configured package and edit the config file.

Package the configured installation:
tar -zcvf elasticsearch-7.17.4_ok.tar.gz elasticsearch-7.17.4/
mv elasticsearch-7.17.4_ok.tar.gz /media/xt/tpf/soft/es7/

JDK
export JAVA_HOME=/opt/app/jdk-11
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH

System configuration (the limits and sysctl settings shown earlier)

Extract and install
mkdir /data/es
cd /data/es
rsync -rltDv /media/xt/tpf/soft/es7/elasticsearch-7.17.4_ok.tar.gz ./
tar -xvf elasticsearch-7.17.4_ok.tar.gz

Configuration
mkdir -p /data/es/data/
mkdir -p /data/es/logs/

Single-node configuration
vim config/elasticsearch.yml
network.host: 127.0.0.1
discovery.seed_hosts: ["127.0.0.1"]
cluster.initial_master_nodes: ["node-1"]

Start
./bin/elasticsearch -d
cat config/elasticsearch.yml
cluster.name: my-application
node.name: node-1
path.data: /data/es/data/
path.logs: /data/es/logs
network.host: 127.0.0.1
http.port: 9200
transport.tcp.port: 9300
discovery.seed_hosts: ["127.0.0.1"]
cluster.initial_master_nodes: ["node-1"]

If network.host is 127.0.0.1, ES can only be reached locally.
For external access it must be set to a concrete, externally reachable IP.
- Example: an Ubuntu system running under Windows.
- To reach the ES inside Ubuntu from Windows, network.host must be the externally visible IP, e.g. 172.31.150.83.
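A quick way to see what the network.host value changes (172.31.150.83 is just the example address from the note above):

# Check which address ES is actually bound to
netstat -tnlp | grep 9200    # or: ss -tnlp | grep 9200

# From another machine (e.g. Windows reaching the Ubuntu guest), this only answers
# when network.host is set to the externally reachable IP
curl -s http://172.31.150.83:9200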
pip install elasticsearch6

Create an index
curl -XPUT http://192.168.73.11:9200/index

Create a mapping
curl -XPOST http://192.168.73.11:9200/index/fulltext/_mapping -H 'Content-Type:application/json' -d'
{
    "properties": {
        "content": {
            "type": "text",
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_max_word"
        }
    }
}'

Index some documents
curl -XPOST http://192.168.73.11:9200/index/fulltext/1 -H 'Content-Type:application/json' -d'
{"content":"时间是一切财富中最宝贵的财富"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/2 -H 'Content-Type:application/json' -d'
{"content":"世界上一成不变的东西,只有“任何事物都是在不断变化的”这条真理。"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/3 -H 'Content-Type:application/json' -d'
{"content":"要使别人喜欢你,首先你得改变对人的态度,把精神放得轻松一点,表情自然,笑容可掬,这样别人就会对你产生喜爱的感觉了。——卡耐基"}
'

Search from Python
from elasticsearch6 import Elasticsearch

es = Elasticsearch('http://192.168.73.11:9200')

# Index name
index_name = 'index'

# Run a simple search request
response = es.search(
    index=index_name,
    body={
        "query": {
            "match_all": {}
        }
    }
)

# Print the hits
print(response['hits']['hits'])

# Close the connection to Elasticsearch
# es.close()

Index a document from Python
from elasticsearch6 import Elasticsearch
import datetime

# Initialise the Elasticsearch client
es = Elasticsearch([{'host': '192.168.73.11', 'port': 9200}])

# Create the index if it does not exist yet
index_name = "index2"
if not es.indices.exists(index=index_name):
    es.indices.create(index=index_name)

# Insert a document
doc_id = "2"
doc_body = {"name": "张三", "age": 30, "email": "aaazhnag@example.com", "created_at": datetime.datetime.utcnow()}
response = es.index(index=index_name, id=doc_id, body=doc_body, doc_type="_doc")

# Print the response
print(response)
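To double-check what the Python snippets wrote, the data can be read back with curl (index2 and document id 2 are the values used in the insert example above):

# Fetch the document written by the Python insert example
curl -s 'http://192.168.73.11:9200/index2/_doc/2?pretty'

# List everything in the index used by the search example
curl -s 'http://192.168.73.11:9200/index/_search?pretty'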
pip install elasticsearch7

from elasticsearch7 import Elasticsearch, helpers

# 1. Create the Elasticsearch connection
es = Elasticsearch(
    hosts=['http://127.0.0.1:9200'],   # service address and port
    http_auth=("elastic", "aaa"),      # username, password
)

# 2. Index name
index_name = "index"

# 3. Delete the index if it already exists (for the demo only; not needed in real use)
if es.indices.exists(index=index_name):
    es.indices.delete(index=index_name)

# 4. Create the index
es.indices.create(index=index_name)

# 5. Build the bulk actions (to_keywords is defined in the full example below)
actions = [
    {
        "_index": index_name,
        "_source": {
            "keywords": to_keywords(para),
            "text": para
        }
    }
    for para in [
        "今天天气不错",]
]

# 6. Bulk-index the documents
helpers.bulk(es, actions)

Connecting to a cluster with X-Pack security enabled (HTTPS plus authentication):
from elasticsearch7 import Elasticsearch

es = Elasticsearch(
    'https://localhost:9200',
    http_auth=('user', 'passwd'),   # the 8.x client calls this basic_auth
    verify_certs=False,             # skip certificate checks for a self-signed cert; prefer ca_certs=... in production
)
from elasticsearch7 import Elasticsearch, helpers
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
import nltk
import re

import warnings
warnings.simplefilter("ignore")  # silence some warnings from the ES client


def to_keywords(input_string):
    '''Keep only the keywords of an (English) text'''
    # Replace every non-alphanumeric character with a space
    no_symbols = re.sub(r'[^a-zA-Z0-9\s]', ' ', input_string)
    word_tokens = word_tokenize(no_symbols)
    # Load the stop-word list (requires nltk.download('punkt') and nltk.download('stopwords') on first use)
    stop_words = set(stopwords.words('english'))
    ps = PorterStemmer()
    # Drop stop words and stem the rest
    filtered_sentence = [ps.stem(w) for w in word_tokens if not w.lower() in stop_words]
    return ' '.join(filtered_sentence)


# 1. Create the Elasticsearch connection
es = Elasticsearch(
    hosts=['http://192.168.73.11:9200'],   # service address and port
    verify_certs=False
    # http_auth=("elastic", "aaa"),        # username, password
)

# 2. Index name
index_name = "index"

# 3. Delete the index if it already exists (for the demo only; not needed in real use)
if es.indices.exists(index=index_name):
    es.indices.delete(index=index_name)

# 4. Create the index
es.indices.create(index=index_name)

# 5. Build the bulk actions
actions = [
    {
        "_index": index_name,
        "_source": {
            "keywords": to_keywords(para),
            "text": para
        }
    }
    for para in [
        "今天天气不错",]
]

# 6. Bulk-index the documents
helpers.bulk(es, actions)
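To confirm the bulk ingestion worked, the index can be queried back over HTTP. A small sketch against the same node:

# Count the documents that were bulk-indexed by the script above
curl -s 'http://192.168.73.11:9200/index/_count?pretty'

# Full-text match on the stored text field
curl -s -XPOST 'http://192.168.73.11:9200/index/_search?pretty' -H 'Content-Type: application/json' -d'
{
    "query": { "match": { "text": "天气" } }
}'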
sklearn2pmml on GitHub; explanation and usage of PMML