Chef cookbook to Manage Apache SolrCloud
This is a Chef cookbook for Apache Solr.
It was primarily developed for testing SolrCloud and its features against a Solr Master/Slave setup.
Currently it supports only the built-in Jetty based SolrCloud deployment. More features and attributes will be added over time; feel free to contribute what you find missing!
SolrCloud is the default deployment and Solr Master/Slave setup is not supported by this cookbook.
https://github.com/vkhatri/chef-solrcloud
This cookbook was tested for Apache Solr v4.9, v4.10 and v5.1.0.
Currently this cookbook supports only the Apache Solr built-in Jetty based deployment.
Currently this cookbook supports only Apache Solr tarball based deployment.
Currently this cookbook supports only SolrCloud Cluster deployment. It does not support Apache Solr Master/Slave Cluster deployment.
Check the Apache Solr documentation for the JDK version required by your Solr version. Oracle JDK 7 is recommended.
zkconfigset provider now refers to resource attribute configset_name instead of name attribute.
Existing users might want to test LWRP zkconfigset resources before using this version.
solrcloud::tarball
- install solr package, directories and service
solrcloud::config
- manages solr base configuration files
solrcloud::jetty
- manages jetty base configuration files and directories
solrcloud::zkcli
- setup zookeeper package for zookeeper client binary (zkCli.sh)
The zkcli recipe does not manage a zookeeper server; its only purpose
is to provide the zookeeper client on all solr nodes
solrcloud::user
- create solr service user
In a Production environment, the solr user is better managed by a user management cookbook
than by solrcloud.
solrcloud::zkconfigsets
- create/delete solrcloud configSet in zookeeper via LWRP
solrcloud::collections
- create/delete solrcloud collection on solrcloud node via LWRP
solrcloud::tarball is the main recipe, which includes all other recipes. For run_list, use solrcloud::tarball.
LWRP - solrcloud_zkconfigset
SolrCloud Zookeeper configSet is managed via LWRP - solrcloud_zkconfigset.
SolrCloud Zookeeper configSets management is enabled by default for all nodes.
This means every node receives the configSets and tries to manage them against
one of the zookeeper servers configured via attribute node[:solrcloud][:solr_config][:solrcloud][:zk_host].
All the nodes communicate with the same zookeeper cluster, hence attributes
`node[:solrcloud][:manage_zkconfigsets]` & `node[:solrcloud][:manage_zkconfigsets_source]`
do not need to be enabled on all the nodes.
Check the Cookbook Advanced Attributes section for attribute details.
zookeeper configSet config changes
LWRP handles config changes by itself. When any change is made to configSet content, the LWRP re-uploads the configSet to zookeeper.
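How the LWRP notices a change is not described here. As a rough illustration only, one way to detect changed configSet content is to fingerprint the directory and compare the result with an earlier fingerprint. The helper name `configset_fingerprint` and the sample file are assumptions made for this sketch, not part of the cookbook.

```ruby
require 'digest'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not from the cookbook): returns one SHA-256 digest
# covering the relative path and content of every file under dir.
def configset_fingerprint(dir)
  digest = Digest::SHA256.new
  files = Dir.glob(File.join(dir, '**', '*')).select { |f| File.file?(f) }.sort
  files.each do |path|
    digest << path.delete_prefix(dir) # relative path, so renames are noticed
    digest << File.binread(path)      # file content
  end
  digest.hexdigest
end

Dir.mktmpdir do |dir|
  conf = File.join(dir, 'conf')
  FileUtils.mkdir_p(conf)
  File.write(File.join(conf, 'schema.xml'), '<schema/>')
  before = configset_fingerprint(dir)

  # Simulate an edit to the configSet content.
  File.write(File.join(conf, 'schema.xml'), '<schema name="xyz"/>')
  after = configset_fingerprint(dir)

  puts(before == after ? 'unchanged' : 'changed, re-upload needed')
end
```

A differing fingerprint would be the signal to upload the configSet again.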
LWRP example
Create a configSet using LWRP:
solrcloud_zkconfigset configset_name do
option option_name
end
Always re-create/upload the configSet, even if it already exists or its config files have not changed:
solrcloud_zkconfigset configset_name do
force_upload true
option option_name
end
OR
Set attribute node[:solrcloud][:force_zkconfigsets_upload] to true, which affects all the configSets, as the
default value of resource attribute :force_upload is node[:solrcloud][:force_zkconfigsets_upload].
Delete a configSet using LWRP:
solrcloud_zkconfigset configset_name do
action :delete
end
configSet via node attribute:
"default_attributes": {
"solrcloud": {
"zkconfigsets": {
"abc": {
"action": "delete"
},
"xyz": {
"option name": "option value"
}
}
}
}
configSets can either be configured in a recipe using the LWRP or via node attribute node[:solrcloud][:zkconfigsets].
configSets defined using attribute node[:solrcloud][:zkconfigsets] do not require a separate LWRP declaration.
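To illustrate what attribute-driven configSets amount to, the sketch below walks a hash shaped like node[:solrcloud][:zkconfigsets] and derives one resource declaration per entry. It is plain Ruby, not the cookbook's recipe code: the method name `configset_resources`, the returned hash layout, and the assumption that a missing "action" means create are all hypothetical.

```ruby
# Hypothetical helper (not from the cookbook): turn an attribute hash into
# a list of { name:, action:, options: } declarations.
def configset_resources(zkconfigsets)
  zkconfigsets.map do |name, options|
    opts = options.dup
    # Assumption: entries without an explicit "action" are created.
    action = (opts.delete('action') || 'create').to_sym
    { name: name, action: action, options: opts }
  end
end

attrs = {
  'abc' => { 'action' => 'delete' },
  'xyz' => { 'force_upload' => true }
}

configset_resources(attrs).each do |res|
  puts "solrcloud_zkconfigset #{res[:name]}: action=#{res[:action]} options=#{res[:options]}"
end
```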
LWRP Options
SolrCloud Zookeeper cmd Reference: https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities
Parameters:
node[:solrcloud][:user]
node[:solrcloud][:group]
node[:solrcloud][:zookeeper][:solr_zkcli]
node[:solrcloud][:zookeeper][:zkcli]
node[:solrcloud][:solr_config][:solrcloud][:zk_host].first
node[:solrcloud][:zkconfigsets_home]
node[:solrcloud][:zkconfigsets_cookbook]
node[:solrcloud][:manage_zkconfigsets]
node[:solrcloud][:force_zkconfigsets_upload]
LWRP configSet source cookbook/location management
All configSet content must be stored under node[:solrcloud][:zkconfigsets_cookbook]/files/default/config set name/conf/ if
not managed separately.
The configSets source cookbook is set to solrcloud by default
and can be changed via attribute node[:solrcloud][:zkconfigsets_cookbook].
If configSets are managed outside of the cookbook, a configSet only gets uploaded if it is missing in zookeeper.
Updates to separately managed configSets are not propagated to zookeeper by default. However, one can use attribute
node[:solrcloud][:force_zkconfigsets_upload] to always upload the configSet regardless of its state.
Setting attribute node[:solrcloud][:force_zkconfigsets_upload] or resource attribute :force_upload always triggers a configSet upload to zookeeper. It is better not to enable resource attribute :force_upload, and instead to set attribute node[:solrcloud][:force_zkconfigsets_upload] on a limited set of nodes.
This may vary from environment to environment.
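The upload rules described above can be summarised as a small decision function. This is a sketch of the documented behavior, not the LWRP's implementation; the method name `upload_configset?` and its keyword arguments are hypothetical.

```ruby
# Hypothetical decision helper (not from the cookbook).
#   exists_in_zk       - the configSet is already present in zookeeper
#   managed_by_cookbook - the source content is managed by the cookbook
#   content_changed    - the source content changed since the last run
#   force_upload       - node[:solrcloud][:force_zkconfigsets_upload] or :force_upload
def upload_configset?(exists_in_zk:, managed_by_cookbook:, content_changed:, force_upload:)
  return true if force_upload      # forced upload always wins
  return true unless exists_in_zk  # a missing configSet is always uploaded
  # Changes are only picked up for cookbook-managed content.
  managed_by_cookbook && content_changed
end

# A separately managed configSet that changed is not re-uploaded by default.
puts upload_configset?(exists_in_zk: true, managed_by_cookbook: false,
                       content_changed: true, force_upload: false)
```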
LWRP - solrcloud_collection
SolrCloud collection is managed via LWRP - solrcloud_collection.
The Create/Delete Collection API does not need to run on all solrcloud cluster nodes, hence attribute
node[:solrcloud][:manage_collections] does not need to be enabled on all the nodes.
Check the Cookbook Advanced Attributes section for attribute details.
collection Update/Change
The collection LWRP only performs collection action=CREATE|DELETE|RELOAD and does not manage any UPDATE/change to the collection.
To change a collection, first make the change in the LWRP or in node attribute node[:solrcloud][:collections][:collection_name][:attribute_name]
for the respective attribute.
Once the change is made in the Chef cookbook, issue the collection UPDATE or respective action call against one of the solrcloud nodes.
The UPDATE call can be tricky and is not managed by Chef, to avoid any unexpected behavior.
Re-issuing the same command could disrupt the solrcloud cluster setup, so it must be done carefully.
LWRP example
Create a collection using LWRP:
solrcloud_collection collection_name do
option option_name
end
Delete a collection using LWRP:
solrcloud_collection collection_name do
action :delete
end
Reload a collection using LWRP:
solrcloud_collection collection_name do
action :reload
end
collection via node attribute:
"default_attributes": {
"solrcloud": {
"collections": {
"abc": {
"action": "delete"
},
"def": {
"action": "reload"
},
"xyz": {
"num_shards": "1",
"name": "xyz",
"replication_factor": "1",
"collection_config_name": "xyz",
"option name": "value"
}
}
}
}
collections can either be configured in a recipe using the LWRP or via node attribute node[:solrcloud][:collections].
collections defined using attribute node[:solrcloud][:collections] do not require a separate LWRP declaration.
LWRP Options
Collection API Reference: https://cwiki.apache.org/confluence/display/solr/Collections+API
Parameters:
node['solrcloud']['jetty_config']['context']['path']
node[:ipaddress]
node[:solrcloud][:port]
node[:solrcloud][:ssl_port]
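For orientation, a collection CREATE boils down to one HTTP call to the Solr Collections API built from the host, port and context path listed above. The sketch below only constructs such a URL; the method name `collection_create_url` is hypothetical and this is not necessarily how the LWRP issues the request. The query parameters (action, name, numShards, replicationFactor, collection.configName) are standard Collections API parameters.

```ruby
require 'uri'

# Hypothetical helper (not from the cookbook): build a Collections API
# CREATE URL from node-style values.
def collection_create_url(host:, port:, context_path:, name:,
                          num_shards:, replication_factor:, config_name:)
  query = URI.encode_www_form(
    'action' => 'CREATE',
    'name' => name,
    'numShards' => num_shards,
    'replicationFactor' => replication_factor,
    'collection.configName' => config_name
  )
  "http://#{host}:#{port}#{context_path}/admin/collections?#{query}"
end

# Example values only; 10.0.0.5 stands in for node[:ipaddress].
puts collection_create_url(host: '10.0.0.5', port: 8983, context_path: '/solr',
                           name: 'xyz', num_shards: 1, replication_factor: 1,
                           config_name: 'xyz')
```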
default[:solrcloud][:manage_zkconfigsets]
(default: true
): manages solrcloud configSets in zookeeper
This attribute should be enabled for limited nodes in solrcloud cluster if possible.
default[:solrcloud][:manage_zkconfigsets_source]
(default: true
): manages solrcloud collections configSets source content directory
This attribute should be enabled for limited nodes in solrcloud cluster if possible.
default[:solrcloud][:notify_zkconfigsets_upload]
(default: true
): notify/triggers configSet upload to zookeeper upon create/update
This attribute should be enabled for limited nodes in solrcloud cluster if possible.
default[:solrcloud][:manage_collections]
(default: true
): if set true, manages solrcloud cluster collections
This attribute should be enabled for limited nodes in solrcloud cluster if possible.
default[:solrcloud][:notify_restart]
(default: false
): notify solr service restart on a solrcloud resource change like config file/template etc.
default[:solrcloud][:notify_restart_upgrade]
(default: false
): notify solr service restart on version upgrade
default[:solrcloud][:restore_cores]
(default: true
): restore older version solr cores configuration to newer version
Note: Disable this option if reverting back to an old version. Before restoring the cores, new version cores directory content gets purged.
If there are changes in cores configuration between older and newer versions, only the current (older) version cores configuration will persist.
default[:solrcloud][:zk_run]
(default: false
): if true solr will start up with embedded zookeeper
Note: Setting option `node[:solrcloud][:zk_run]` will remove solrcloud config zk_host from solr.xml, mainly meant for testing purpose
default[:solrcloud][:enable_jmx]
(default: true
): enable jmx
default[:solrcloud][:port]
(default: 8983
): solr service port (must be >1024 for non root user)
default[:solrcloud][:ssl_port]
(default: 8984
): solr ssl service port (>1024 for non root user)
default[:solrcloud][:enable_ssl]
(default: true
): enable solr ssl connector
default[:solrcloud][:enable_request_log]
(default: true
): enable request log
default[:solrcloud][:solr_config][:solrcloud][:zk_host]
(default: []
): zookeeper servers, e.g. ["server:port", "server:port"]
With attribute `default[:solrcloud][:zk_run]`, this attribute will get local zookeeper server.
default[:solrcloud][:java_options]
(default: []
): java options
default[:solrcloud][:auto_java_memory]
(default: true
): enable auto java memory allocation, sets java attribute -Xmx
for node[:solrcloud][:java_options]
This option calculates maximum allowed memory (multiple of 1024) for java process with minimum system memory reservation defined by attribute `node[:solrcloud][:auto_system_memory]`
default[:solrcloud][:auto_system_memory]
(default: 768
): memory to preserve for OS, required when attribute default[:solrcloud][:auto_java_memory]
is set
default[:solrcloud][:install_java]
(default: true
): setup java, disable to manage java outside of this cookbook
default[:solrcloud][:context_name]
(default: solr
): default solr jetty context path value
default[:solrcloud][:force_zkconfigsets_upload]
(default: false
): if set, the zkconfigset LWRP will always upload the configSet to zookeeper, even if the configSet exists or there is no update. This option is useful when the configSet source directory is managed separately.
This attribute should be enabled for limited nodes in solrcloud cluster if possible.
default[:solrcloud][:user]
(default: solr
): solr service user
default[:solrcloud][:group]
(default: solr
): solr service group
default[:solrcloud][:user_home]
(default: nil
): solr service user home
default[:solrcloud][:setup_user]
(default: true
): manage solr user for solr service using solrcloud::user
recipe
default[:solrcloud][:version]
(default: 5.1.0
): solr package version
default[:solrcloud][:major_version]
(default: calculated
): solr package major version to configure solr 4 / 5, valid values - 4 5
default[:solrcloud][:server_base_dir_name]
(default: calculated
): solr base directory to configure solr 4 / 5, valid values - example server
default[:solrcloud][:zk_run_data_dir]
(default: node[:solrcloud][:install_dir]/zookeeperdata
): embedded zookeeper data directory
default[:solrcloud][:zk_run_port]
(default: 2181
): embedded zookeeper port
default[:solrcloud][:install_dir]
(default: /usr/local/solr
): jetty home directory - jetty.home
default[:solrcloud][:data_dir]
(default: /opt/solr
): solr collection data directory - solr.data.dir
solrconfig.xml for each configSet needs to set dataDir for this location usage, like:
<dataDir>${solr.data.dir:}/collection name</dataDir>
default[:solrcloud][:solr_home]
(default: node[:solrcloud][:install_dir]/solr
): solr home
default[:solrcloud][:cores_home]
(default: node[:solrcloud][:solr_home]/cores
): solr collection/core home
default[:solrcloud][:shared_lib]
(default: node[:solrcloud][:install_dir]
/lib): solr default lib directory
default[:solrcloud][:config_sets]
(default: node[:solrcloud][:solr_home]/configsets
): solr cores configSets directory
default[:solrcloud][:service_name]
(default: solr
): solr service name
default[:solrcloud][:service_start_wait]
(default: 15
): solr server after start up wait time
default[:solrcloud][:dir_mode]
(default: 0755
): solr resource default directory mode
default[:solrcloud][:pid_dir]
(default: /var/run/solr
): solr pid directory
default[:solrcloud][:log_dir]
(default: /var/log/solr
): solr log directory
default[:solrcloud][:template_cookbook]
(default: solrcloud
): solr template resources cookbook
default[:solrcloud][:zkconfigsets_cookbook]
(default: solrcloud
): zookeeper configSet cookbook
default[:solrcloud][:zkconfigsets_home]
(default: /usr/local/solr_zkconfigsets
): configs location for zookeeper configSet upconfig
default[:solrcloud][:zookeeper][:version]
(default: 3.4.6
): zookeeper package setup for zkCli.sh
default[:solrcloud][:limits][:memlock]
(default: unlimited
): solr service user memory limit
default[:solrcloud][:limits][:nofile]
(default: 48000
): solr service user file limit
default[:solrcloud][:limits][:nproc]
(default: unlimited
): solr service user process limit
default[:solrcloud][:log4j][:level]
(default: 10MB
): solr log threshold
default[:solrcloud][:log4j][:console]
(default: false
): enable/disable CONSOLE log
default[:solrcloud][:log4j][:max_file_size]
(default: 10MB
): maximum log file size
default[:solrcloud][:log4j][:max_backup_index]
(default: 10
): log files retention
default[:solrcloud][:log4j][:conversion_pattern]
(default: '%d{ISO8601} [%t] %-5p %c{3} %x - %m%n'
): log conversion pattern
default[:solrcloud][:request_log][:retain_days]
(default: 10
): request log files retention
default[:solrcloud][:request_log][:log_cookies]
(default: false
): enable log cookies
default[:solrcloud][:request_log][:time_zone]
(default: UTC
): request log time zone
default[:solrcloud][:jetty_config][:server][:min_threads]
(default: 10
): minimum jetty threads
default[:solrcloud][:jetty_config][:server][:max_threads]
(default: 10000
): maximum jetty threads
default[:solrcloud][:jetty_config][:server][:detailed_dump]
(default: false
): enable jetty detailed dump
default[:solrcloud][:jetty_config][:connector][:stats_on]
(default: true
): enable statistics
default[:solrcloud][:jetty_config][:connector][:max_idle_time]
(default: 50000
): max idle time for connector (http)
default[:solrcloud][:jetty_config][:connector][:low_resource_max_idle_time]
(default: 1500
):
default[:solrcloud][:jetty_config][:ssl_connector][:need_client_auth]
(default: false
): enable client ssl authentication, this feature is not tested yet
default[:solrcloud][:jetty_config][:ssl_connector][:max_idle_time]
(default: 30000
): jetty ssl maximum idle time
default[:solrcloud][:key_store][:manage]
(default: true
): generate key store for node key store attribute (enabled for testing purpose)
default[:solrcloud][:key_store][:key_store_file]
(default: solr.keystore
): key store file name, file location - node.solrcloud.install_dir/resources/etc/
default[:solrcloud][:key_store][:key_store_password]
(default: ``): key store password
default[:solrcloud][:key_store][:cookbook]
(default: solrcloud
): jetty ssl key store source cookbook, required if key store file management is disabled. Typical for Production environment.
default[:solrcloud][:key_store][:key_algo]
(default: RSA
): key store Algorithm
default[:solrcloud][:key_store][:cn]
(default: localhost
): key store CN
default[:solrcloud][:key_store][:ou]
(default: ApacheSolrCloudTest
): key store OU
default[:solrcloud][:key_store][:o]
(default: lucene.apache.org
): key store O
default[:solrcloud][:key_store][:c]
(default: US
): key store C
default[:solrcloud][:key_store][:ext]
(default: san=ip:127.0.0.1
): key store ext params
default[:solrcloud][:key_store][:validity]
(default: 999999
): key store validity
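To show how the key store attributes above fit together, the sketch below assembles a `keytool -genkeypair` argument list from a hash shaped like node[:solrcloud][:key_store]. The exact command the cookbook runs is not shown in this document, so treat this as an assumption-laden illustration: the method name `keytool_args`, the `solr` alias and the example password are hypothetical.

```ruby
require 'shellwords'

# Hypothetical helper (not from the cookbook): map key store attributes
# onto keytool -genkeypair arguments.
def keytool_args(ks)
  dname = "CN=#{ks[:cn]}, OU=#{ks[:ou]}, O=#{ks[:o]}, C=#{ks[:c]}"
  [
    'keytool', '-genkeypair',
    '-alias', ks.fetch(:alias, 'solr'), # alias is an assumption
    '-keyalg', ks[:key_algo],
    '-dname', dname,
    '-ext', ks[:ext],
    '-validity', ks[:validity].to_s,
    '-keystore', ks[:key_store_file],
    '-storepass', ks[:key_store_password],
    '-keypass', ks[:key_store_password]
  ]
end

key_store = {
  key_store_file: 'solr.keystore',
  key_store_password: 'example-password', # placeholder, not the cookbook default
  key_algo: 'RSA',
  cn: 'localhost', ou: 'ApacheSolrCloudTest', o: 'lucene.apache.org', c: 'US',
  ext: 'san=ip:127.0.0.1',
  validity: 999999
}

puts Shellwords.join(keytool_args(key_store))
```

Building an argument array and quoting it with Shellwords keeps the distinguished name, which contains spaces and commas, intact when the command is passed to a shell.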
default[:solrcloud][:jmx][:port]
(default: 1099
): jmx port
default[:solrcloud][:jmx][:users]
(default: users - solrmonitor solrconfig
): jmx defaults users and roles, this feature is not tested yet
default[:solrcloud][:jetty_config][:context][:path]
(default: /solr
): solr default context path
default[:solrcloud][:jetty_config][:context][:temp_directory]
(default: /solr-webapp
): solr webapp directory
default[:solrcloud][:jetty_config][:context][:war]
(default: /webapps/solr.war
): jetty webapp solr war file location
solr.xml Reference: https://cwiki.apache.org/confluence/display/solr/Format+of+solr.xml
default[:solrcloud][:solr_config][:admin_handler]
(default: org.apache.solr.handler.admin.CoreAdminHandler
): solr.xml solr param adminHandler
default[:solrcloud][:solr_config][:admin_path]
(default: /solr/admin
): solr.xml param adminPath
default[:solrcloud][:solr_config][:core_load_threads]
(default: 3
): solr.xml solr param coreLoadThreads
default[:solrcloud][:solr_config][:core_root_directory]
(default: node[:solrcloud][:cores_home]
): solr.xml solr param coreRootDirectory
default[:solrcloud][:solr_config][:shared_lib]
(default: node[:solrcloud][:shared_lib]
): solr.xml solr param sharedLib
default[:solrcloud][:solr_config][:management_path]
(default: nil
): solr.xml solr param managementPath
default[:solrcloud][:solr_config][:share_schema]
(default: false
): solr.xml solr param shareSchema
default[:solrcloud][:solr_config][:transient_cache_size]
(default: 1000000
): solr.xml solr param transientCacheSize
default[:solrcloud][:solr_config][:solrcloud][:host_context]
(default: solr
): solr.xml param solrcloud hostContext
default[:solrcloud][:solr_config][:solrcloud][:distrib_update_conn_timeout]
(default: 1000000
): solr.xml param solrcloud distribUpdateConnTimeout
default[:solrcloud][:solr_config][:solrcloud][:distrib_update_so_timeout]
(default: 1000000
): solr.xml param solrcloud distribUpdateSoTimeout
default[:solrcloud][:solr_config][:solrcloud][:leader_vote_wait]
(default: 1000000
): solr.xml param solrcloud leaderVoteWait
default[:solrcloud][:solr_config][:solrcloud][:zk_client_timeout]
(default: 15000
): solr.xml param solrcloud zkClientTimeout
default[:solrcloud][:solr_config][:solrcloud][:zk_host]
(default: []
): zookeeper servers, e.g. ["server:port", "server:port"]
default[:solrcloud][:solr_config][:solrcloud][:generic_core_node_names]
(default: true
): solr.xml param solrcloud genericCoreNodeNames
default[:solrcloud][:solr_config][:shard_handler_factory][:socket_timeout]
(default: 0
): solr.xml param shardHandlerFactory socketTimeout
default[:solrcloud][:solr_config][:shard_handler_factory][:conn_timeout]
(default: 0
): solr.xml param shardHandlerFactory connTimeout
default[:solrcloud][:solr_config][:logging][:enabled]
(default: false
): solr.xml param logging enabled, not required
default[:solrcloud][:solr_config][:logging][:logging_class]
(default: nil
): solr.xml param logging class, not required
default[:solrcloud][:solr_config][:logging][:watcher][:logging_size]
(default: 1000
): solr.xml param logging size, not required
default[:solrcloud][:solr_config][:logging][:watcher][:threshold]
(default: INFO
): solr.xml param logging threshold, not required
default[:solrcloud][:hdfs][:enable]
(default: false
): to run solrcloud on hdfs, set it to true
default[:solrcloud][:hdfs][:directory_factory]
(default: HdfsDirectoryFactory
): solr hdfs directory factory
default[:solrcloud][:hdfs][:lock_type]
(default: hdfs
): solr hdfs lock type
default[:solrcloud][:hdfs][:hdfs_home]
(default: nil
): syntax: 'hdfs://host:port/path'
Note: SolrCloud on HDFS Deployment using this cookbook is not yet tested, check online solr on hdfs for more info
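Among the attributes above, major_version and server_base_dir_name are listed as calculated, with valid values 4 / 5 and example / server. A minimal sketch of such a calculation follows. The method names are hypothetical and the cookbook's own logic may differ; the mapping assumes Solr 4 ships its Jetty under example and Solr 5 under server.

```ruby
# Hypothetical helper (not from the cookbook): derive the major version
# from a version string such as "5.1.0".
def solr_major_version(version)
  version.to_s.split('.').first.to_i
end

# Hypothetical helper: pick the Solr base directory name for a major version.
def server_base_dir_name(major)
  case major
  when 4 then 'example'
  when 5 then 'server'
  else raise ArgumentError, "unsupported Solr major version: #{major}"
  end
end

%w[4.10.3 5.1.0].each do |version|
  major = solr_major_version(version)
  puts "#{version}: major=#{major} base_dir=#{server_base_dir_name(major)}"
end
```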
ulimit cookbook
java cookbook
To deploy solrcloud using this cookbook, below items are required:
Directory Structure
SolrCloud configSets stored in zookeeper are configured as file resources.
Each configSet is stored under node[:solrcloud][:zkconfigsets_cookbook]/files/default/configSet name.
The configSet folder follows the standard of having a conf folder with all the configuration files.
So, the directory structure will look like - node[:solrcloud][:zkconfigsets_cookbook]/files/default/configSet name/conf.
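A quick way to sanity-check this layout before a Chef run is to look for configSet folders that lack a conf directory. The sketch below does that in plain Ruby; the helper name `configsets_missing_conf` and the sample folder names are hypothetical.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not from the cookbook): list configSet directories
# under <cookbook_root>/files/default that have no conf folder.
def configsets_missing_conf(cookbook_root)
  base = File.join(cookbook_root, 'files', 'default')
  return [] unless Dir.exist?(base)

  Dir.children(base).sort.select do |name|
    path = File.join(base, name)
    File.directory?(path) && !Dir.exist?(File.join(path, 'conf'))
  end
end

# Demo against a throwaway directory tree.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'files', 'default', 'samplecollection', 'conf'))
  FileUtils.mkdir_p(File.join(root, 'files', 'default', 'broken'))
  p configsets_missing_conf(root)
end
```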
Managing same configSet for Multiple Environments
Managing configSet configuration across environments can be achieved in different ways, like a separate node[:solrcloud][:zkconfigsets_cookbook] per environment.
OR
Simply update the node[:solrcloud][:zkconfigsets_cookbook] attribute with your configSet cookbook and update the metadata.rb file with line:
'depends node[:solrcloud][:zkconfigsets_cookbook]'.
Adjust the attributes according to your requirement. Below mentioned attributes will work just fine for a single node solrcloud cluster.
"default_attributes": {
"solrcloud": {
"zk_run": true,
"port": "8080",
"setup_user": true,
"manager": true,
"zkconfigsets": {
"samplecollection": {}
},
"collections": {
"samplecollection": {
"collection_config_name": "samplecollection"
}
}
}
}
The attributes below are crucial for a Multi Node Cluster. It is not advised to enable them on all the nodes in the cluster. For example, each new node would trigger a zookeeper configSet re-upload. Creating a new collection is better managed by one node, to prevent a false collection state in the cluster.
"default_attributes": {
"solrcloud": {
"manage_collections": true,
"manage_zkconfigsets": true
}
}
Adjust the attributes according to your requirement. Below mentioned attributes will work just fine for a single node solrcloud cluster.
"default_attributes": {
"solrcloud": {
"solr_config": {
"solrcloud": {
"zk_host": [
"zookeeper_ip:zookeeper_port"
]
}
},
"port": "8080",
"setup_user": true,
"manage_collections": true,
"manage_zkconfigsets": true,
"zkconfigsets": {
"samplecollection": {}
},
"collections": {
"samplecollection": {
"collection_config_name": "samplecollection"
}
}
}
}
Note: You might want to enable attribute "manager": true on limited cluster nodes. In a large cluster, enabling this value on limited nodes would create less overhead for zookeeper.
Adjust the attributes according to your requirement. Below mentioned attributes will work just fine for a single node solrcloud cluster.
On any one of the cluster nodes, enable attribute node[:solrcloud][:zk_run] and use its ip address as the zookeeper server.
"default_attributes": {
"solrcloud": {
"solr_config": {
"solrcloud": {
"zk_host": [
"instance_with_zk_run_ip:zookeeper_port_default_2181"
]
}
},
"port": "8080",
"setup_user": true,
"zkconfigsets": {
"samplecollection": {}
},
"collections": {
"samplecollection": {
"collection_config_name": "samplecollection"
}
}
}
}
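The relationship between zk_run, zk_run_port and zk_host can be sketched as below: a node running the embedded zookeeper points at itself, while the other nodes use the configured server list, joined with commas as Solr expects for its zkHost setting. The method name `zk_host_string` is hypothetical, and the use of the node's own ip address for the embedded case is an assumption, not the cookbook's confirmed behavior.

```ruby
# Hypothetical helper (not from the cookbook): resolve the zookeeper
# connection string for a node.
def zk_host_string(zk_host:, zk_run: false, ipaddress: '127.0.0.1', zk_run_port: 2181)
  # Assumption: with embedded zookeeper the node uses its own address.
  hosts = zk_run ? ["#{ipaddress}:#{zk_run_port}"] : zk_host
  hosts.join(',')
end

# Node with embedded zookeeper (example address).
puts zk_host_string(zk_host: [], zk_run: true, ipaddress: '10.0.0.5')
# Other cluster nodes pointing at that instance.
puts zk_host_string(zk_host: ['10.0.0.5:2181'])
```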
To deploy multiple clusters, simply create multiple roles with different zookeeper server or update node attribute with respective cluster zookeeper server(s).
Zookeeper server attribute - node[:solrcloud][:solr_config][:solrcloud][:zk_host]
SolrCloud on HDFS has not been tested yet, but configuration from Apache Solr documentation has been added to the cookbook.
Some of the common java options tuning by Shawn Heisey.
Node attributes:
"default_attributes": {
"solrcloud": {
"java_options": [
"-Xms1024m",
"-XX:+UseConcMarkSweepGC",
"-XX:CMSInitiatingOccupancyFraction=75",
"-XX:NewRatio=3",
"-XX:MaxTenuringThreshold=8",
"-XX:+CMSParallelRemarkEnabled",
"-XX:+ParallelRefProcEnabled",
"-XX:+AggressiveOpts"
]
}
}
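When auto_java_memory is enabled, an -Xmx value is added to java_options based on system memory, keeping auto_system_memory (default 768) reserved for the OS and using a multiple of 1024. The sketch below shows one plausible version of that arithmetic. The method names are hypothetical, the exact formula used by the cookbook is not shown in this document, and skipping the calculation when -Xmx is already present is an assumption.

```ruby
# Hypothetical helper (not from the cookbook): compute -Xmx from total
# system memory in MB, reserving reserve_mb for the OS and rounding down
# to a multiple of 1024. Returns nil when less than 1024 MB remains.
def auto_xmx(total_mb, reserve_mb = 768)
  blocks = (total_mb - reserve_mb) / 1024
  blocks >= 1 ? "-Xmx#{blocks * 1024}m" : nil
end

# Hypothetical helper: append the computed -Xmx unless one is already set.
def effective_java_options(java_options, total_mb, reserve_mb = 768)
  return java_options if java_options.any? { |opt| opt.start_with?('-Xmx') }

  xmx = auto_xmx(total_mb, reserve_mb)
  xmx ? java_options + [xmx] : java_options
end

options = ['-Xms1024m', '-XX:+UseConcMarkSweepGC']
puts effective_java_options(options, 8192).join(' ')
```

With 8192 MB of system memory and the default reservation, 7424 MB remain, which rounds down to 7168 MB.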
Create a named feature branch (like add_component_x)
Run the tests (rake), ensuring they all pass
Update README.md to describe the change
Authors:: Virender Khatri and Contributors
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.