Cassandra stores memtables either on the Java heap, in off-heap (native) memory, or both. The limits on heap and off-heap memory can be set via the properties memtable_heap_space_in_mb and memtable_offheap_space_in_mb, respectively. By default, Cassandra sets each of these values to 1/4 of the total heap size set in the cassandra-env.sh file. Allocating memory for memtables reduces the memory available for caching and other internal Cassandra structures, so tune carefully and in small increments.

You can influence how Cassandra allocates and manages memory via the memtable_allocation_type property. This property configures another of Cassandra’s pluggable interfaces, selecting which implementation of the abstract class org.apache.cassandra.utils.memory.MemtablePool is used to control the memory used by each memtable. The default value heap_buffers causes Cassandra to allocate memtables on the heap using the Java New I/O (NIO) API, while offheap_buffers uses Java NIO to allocate a portion of each memtable both on and off the heap. The offheap_objects option uses native memory directly, making Cassandra entirely responsible for memory management and garbage collection of memtable memory. This is a less well-documented feature, so it’s best to stick with the default here until you gain more experience.

Another element related to tuning the memtables is memtable_flush_writers. This setting, which is 2 by default, indicates the number of threads used to write out the memtables when it becomes necessary. If your data directories are backed by SSD, you should increase this to the number of cores, without exceeding the maximum value of 8. If you have a very large heap, it can improve performance to set this count higher, as these threads are blocked during disk I/O.

You can also enable metered flushing on each table via the CQL CREATE TABLE or ALTER TABLE command. The memtable_flush_period_in_ms option sets the interval at which the memtable will be flushed to disk. Setting this property results in more predictable write I/O, but will also result in more SSTables and more frequent compactions, possibly impacting read performance. The default value of 0 means that periodic flushing is disabled, and flushes will only occur when the commit log threshold or the memtable threshold is reached.
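The sizing rules described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated in the text, not part of Cassandra: the helper names, the example heap size, and the keyspace and table names (hotel.hotels) are all hypothetical, and the non-SSD case simply keeps the default of 2 as an assumption.

```python
# A minimal sketch of the memtable sizing rules described in the text.
# All function names and example values here are hypothetical.

MAX_FLUSH_WRITERS = 8      # maximum value quoted in the text
DEFAULT_FLUSH_WRITERS = 2  # default value quoted in the text


def default_memtable_space_mb(heap_size_mb: int) -> int:
    """Default for memtable_heap_space_in_mb and
    memtable_offheap_space_in_mb: 1/4 of the total heap size."""
    if heap_size_mb <= 0:
        raise ValueError("heap size must be positive")
    return heap_size_mb // 4


def suggested_flush_writers(cores: int, ssd: bool) -> int:
    """With SSD-backed data directories, use the number of cores, capped
    at 8. Otherwise keep the default (an assumption, not a rule)."""
    if cores <= 0:
        raise ValueError("core count must be positive")
    if not ssd:
        return DEFAULT_FLUSH_WRITERS
    return min(cores, MAX_FLUSH_WRITERS)


def alter_flush_period_cql(keyspace: str, table: str, period_ms: int) -> str:
    """Build the CQL statement that sets memtable_flush_period_in_ms.
    A period of 0 disables periodic flushing."""
    if period_ms < 0:
        raise ValueError("flush period cannot be negative")
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH memtable_flush_period_in_ms = {period_ms};"
    )


if __name__ == "__main__":
    heap_mb = 8192  # hypothetical heap size from cassandra-env.sh
    space_mb = default_memtable_space_mb(heap_mb)
    print(f"memtable_heap_space_in_mb: {space_mb}")
    print(f"memtable_offheap_space_in_mb: {space_mb}")
    print(f"memtable_flush_writers: {suggested_flush_writers(16, ssd=True)}")
    # Flush the hypothetical hotel.hotels table once an hour.
    print(alter_flush_period_cql("hotel", "hotels", 3_600_000))
```

The first three printed lines are in the form you would place in cassandra.yaml, while the last is a statement you could run in cqlsh. Treat the output as a starting point and adjust in small increments, as recommended above.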
Commit Logs

There are two sets of files that Cassandra writes to as part of handling update operations: the commit log and the SSTable files. Their different purposes need to be considered in order to understand how to treat them during configuration.

Remember that the commit log can be thought of as short-term storage that helps ensure that data is not lost if a node crashes or is shut down before memtables can be flushed to disk. That’s because when a node is restarted, the commit log gets replayed. In fact, that’s the only time the commit log is read; clients never read from it. But the normal write operation to the commit log blocks, so it would damage performance to require clients to wait for the write to finish.