Apache Cassandra
Apache Cassandra is a linearly scalable NoSQL database. For details, read the official documentation.
Installation
Install the cassandra (AUR) package.
Configuration
Systemd unit
Logging to journald
By default, the package logs to /var/log/cassandra/system.log. To log to journald instead, edit the unit: add -f to the ExecStart line so that the service runs in the foreground, and set Type to simple, because the process then no longer forks.
This can also be done with a systemd drop-in file.
/etc/systemd/system/cassandra.service.d/override.conf
[Service]
Type=simple
ExecStart=
ExecStart=/usr/bin/cassandra -p /run/cassandra/cassandra.pid -f
If Cassandra was already running, shut it down cleanly and then restart it.
$ nodetool drain
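The shutdown and restart can be sketched as the shell function below. It is an illustration rather than part of the package: the function name restart_cassandra and the DRY_RUN switch are made up here, the unit name is taken from the drop-in path above, and the systemctl calls need root privileges.

```shell
#!/bin/sh
# Sketch: restart Cassandra so that a changed unit or drop-in takes effect.
# restart_cassandra and DRY_RUN are hypothetical names used for illustration.
# Set DRY_RUN=1 to print the commands instead of running them.
restart_cassandra() {
    run=${DRY_RUN:+echo}

    # Flush memtables and stop accepting writes before the restart.
    $run nodetool drain || return

    # Make systemd re-read unit files, including override.conf.
    $run systemctl daemon-reload || return

    # Restart the service with the new settings.
    $run systemctl restart cassandra.service || return

    # Show the most recent journal entries for the unit.
    $run journalctl -u cassandra.service -n 20 --no-pager
}
```

Running `DRY_RUN=1 restart_cassandra` prints the four commands without executing them, which is a cheap way to review the sequence first.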
cassandra.yaml
The default cassandra.yaml is extensively commented. When installed via the cassandra (AUR) package, it is located at /etc/cassandra/cassandra.yaml.
Basic config items to change
Set the name of the cluster. It must be identical on every node that you intend to have in this cluster.
cluster_name: 'Test Cluster'
Set the directories Cassandra writes its data to. The value below is the default that is used if the option is unset. If possible, point this at a disk used only for storing Cassandra data.
data_file_directories:
    - /var/lib/cassandra/data
On the first node (the seed node), make sure the seeds include its own IP address and at least one other node. On all other nodes, try to list a broad range of nodes in the cluster. If a node cannot connect to any of the seeds listed in this configuration at startup, it will fail to start.
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.1.53, 192.168.1.52"
Set this according to the type of disk Cassandra stores its data on: ssd or spinning.
disk_optimization_strategy: ssd|spinning
This is the address Cassandra binds to and tells other Cassandra nodes to connect to.
listen_address: 192.168.1.51
This is the address this node advertises to the other nodes in the cluster. Make sure the other nodes can reach this node on this address.
broadcast_address: 192.168.1.51
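This is the address that the Thrift RPC service and the native transport server bind to for client connections. Set to 0.0.0.0, it listens on all interfaces, which is fine as long as the node is firewalled. With 0.0.0.0, broadcast_rpc_address must also be set to an address other than 0.0.0.0.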
rpc_address: 0.0.0.0
Recommended settings for Linux
hsha stands for "half synchronous, half asynchronous". All Thrift clients are handled asynchronously using a small number of threads that does not vary with the number of Thrift clients, so it scales well to many clients. It is not recommended on Windows machines, where hsha is about 30% slower.
rpc_server_type: hsha
Because hsha is used, rpc_max_threads must be set, or Cassandra will refuse to start. rpc_max_threads is the maximum number of client requests this server may execute concurrently.
rpc_max_threads: 100
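To review the values set so far, a small shell function can print the relevant top-level keys. This is a minimal sketch: the function name show_cassandra_settings is made up for illustration, and the default path is the package location mentioned above. Nested settings such as seed_provider and data_file_directories span several lines and are not covered by this simple pattern.

```shell
#!/bin/sh
# Sketch: print the active (uncommented) top-level settings discussed above.
# show_cassandra_settings is a hypothetical helper name.
# $1 is the path to cassandra.yaml; it defaults to the package location.
show_cassandra_settings() {
    conf="${1:-/etc/cassandra/cassandra.yaml}"
    # Anchoring at the start of the line skips commented-out defaults.
    grep -E '^(cluster_name|disk_optimization_strategy|listen_address|broadcast_address|rpc_address|rpc_server_type|rpc_max_threads):' "$conf"
}
```

Calling `show_cassandra_settings` with no argument reads /etc/cassandra/cassandra.yaml. Comparing its output across nodes is a quick way to confirm that cluster_name matches everywhere.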
Troubleshooting
If Cassandra fails to run as a service, try running Cassandra directly:
$ cassandra
If you receive the following error:
Improperly specified VM option 'ThreadPriorityPolicy=42'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
This error means that Cassandra was started with a JVM newer than Java 8, which this Cassandra version requires. Install Java 8 following the directions in Java, then switch the default JVM using archlinux-java.
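Before switching, it can help to confirm which Java version is currently the default. The sketch below is illustrative only: the helper names is_java8_version and default_java_version are made up here, and the check relies on Java 8 reporting its version as 1.8.x.

```shell
#!/bin/sh
# Sketch: decide whether a Java version string belongs to Java 8.
# Both function names are hypothetical helpers.
# Java 8 reports versions such as 1.8.0_292; later releases report 9, 11, 17, ...
is_java8_version() {
    case "$1" in
        1.8|1.8.*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Extract the quoted version string from `java -version`,
# which prints its report on stderr.
default_java_version() {
    java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}
```

For example, `is_java8_version "$(default_java_version)" || echo "default JVM is not Java 8"` prints a warning when a switch is needed. `archlinux-java status` lists the installed environments, and `archlinux-java set` followed by one of the listed names selects the default. The exact name of the Java 8 environment depends on which Java 8 package is installed.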