What is Apache Cassandra?
NoSQL database
Schema-free
Very fast writes
Cassandra is designed to handle big data workloads across multiple nodes with no single point of failure
Cassandra addresses the problem of failures by employing a peer-to-peer distributed system across homogeneous nodes where data is distributed among all nodes in the cluster
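As a rough illustration of how data can be spread across peer nodes with no master, here is a toy token ring. It is only a sketch: the `ToyRing` class, the node names, and the use of MD5 to derive tokens are assumptions made for this example, not Cassandra's actual partitioner or replication strategy.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Hash a key to a position (token) on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Toy token ring: every node is an equal peer that owns one token."""

    def __init__(self, nodes):
        # Deriving a node's token from its name is an illustrative choice.
        self._ring = sorted((token_for(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas_for(self, key: str, replication_factor: int = 1):
        """Return the nodes holding `key`, walking clockwise from its token."""
        size = len(self._ring)
        start = bisect.bisect_left(self._tokens, token_for(key)) % size
        count = min(replication_factor, size)
        return [self._ring[(start + i) % size][1] for i in range(count)]


ring = ToyRing(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas_for("user:42", replication_factor=3)
print(owners)
```

Any node can compute the same owner list for a key, so no single coordinator node is required, and losing one of the three owners still leaves two copies.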
High Level Architecture of Apache Cassandra

Performance tips – 1 (DB Modeling)
Model your DB carefully
Need to understand your application and access patterns
The DB is modelled around access patterns, which is very different from the RDBMS world
Use denormalization, super column families, and similar features to avoid the costly joins of the RDBMS world
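A minimal sketch of modelling around access patterns, using plain Python dicts to stand in for column families. The names `posts_by_author`, `posts_by_tag`, and `save_post` are hypothetical; the point is only that the same data is written once per query it must serve.

```python
from collections import defaultdict

# Two hypothetical "column families", each keyed for one access pattern.
posts_by_author = defaultdict(list)  # query: all posts written by an author
posts_by_tag = defaultdict(list)     # query: all posts carrying a tag


def save_post(post_id, author, tags, title):
    """Write the same post once per access pattern (denormalization)."""
    row = {"post_id": post_id, "author": author, "title": title}
    posts_by_author[author].append(row)
    for tag in tags:
        posts_by_tag[tag].append(row)


save_post(1, "alice", ["cassandra", "nosql"], "Tuning writes")
save_post(2, "bob", ["cassandra"], "Tuning reads")

# Each query is a single key lookup; no join is needed at read time.
cassandra_titles = [p["title"] for p in posts_by_tag["cassandra"]]
print(cassandra_titles)
```

The trade-off is extra writes and duplicated data in exchange for cheap reads, which suits a store where writes are fast.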
Performance tips – 2 – 4 (CF settings)
Column family parameters impacting performance
Keys_cached – Number of keys to be cached
Row_cached – Number of rows to be cached (needs more memory, but can improve performance drastically)
Preload_row_cache – whether to prepopulate row cache on startup
Large memtables – can improve read performance (there are 4-5 memtable-related settings, including memtable_flush_writers, the number of flush threads)
Each CF is stored on disk in its own separate files, so keep related columns in the same CF; super column families (SCF) can come in handy here
Gc_grace_seconds – Time to wait before removing tombstones.
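To make the gc_grace_seconds rule concrete, here is a simplified check of when a tombstone becomes eligible for removal. The function name is hypothetical, and real compaction applies further conditions; 864000 seconds (10 days) is the usual default.

```python
DEFAULT_GC_GRACE_SECONDS = 864000  # 10 days


def tombstone_is_purgeable(deleted_at, now,
                           gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """True once the grace period since the delete has fully elapsed.

    `deleted_at` and `now` are Unix timestamps in seconds. This is a
    simplification: it ignores every other condition compaction checks.
    """
    return now - deleted_at > gc_grace_seconds


one_day = 24 * 60 * 60
print(tombstone_is_purgeable(deleted_at=0, now=5 * one_day))
print(tombstone_is_purgeable(deleted_at=0, now=11 * one_day))
```

A shorter grace period frees space sooner, but a node that was down for longer than the grace period can bring deleted data back unless it is repaired first.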
Performance tips – 5-6 (concurrent reads and writes)
Concurrent_reads (by default this is 32): A good value is 4 concurrent_reads per processor core. Increase this value for systems with fast I/O storage.
Concurrent_write (by default this is 32): If needed, increase this value for systems with many cores, but the out-of-the-box setting works fine for most requirements, as writes are usually very fast.
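The rule of thumb above (4 concurrent_reads per core) can be turned into a small helper. This is only a sketch: `suggested_concurrent_reads` is a hypothetical name, and the result is a starting point to benchmark, not a recommended final value.

```python
import os


def suggested_concurrent_reads(cores, per_core=4):
    """Apply the '4 concurrent_reads per processor core' rule of thumb."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return per_core * cores


cores = os.cpu_count() or 1  # cpu_count() may return None
print(f"{cores} cores -> concurrent_reads: {suggested_concurrent_reads(cores)}")
```

With this rule, an 8-core machine lands exactly on the default of 32.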
Performance tips – 7 – 9 (commitlog)
Commitlog_rotation_threshold_in_mb – how large the commit log can grow before a new file is created
Commitlog_sync – Can be periodic or batch. Using batch can reduce performance, as it blocks until the write operation is synced to disk.
Use a separate disk for commit logs, to reduce I/O contention during writes.
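The difference between the two commitlog_sync modes can be sketched with a toy log writer. `ToyCommitLog` and `count_syncs` are hypothetical names, and the model is deliberately simplified: here batch mode syncs on every append and periodic mode syncs only once a period has elapsed, whereas the real implementation groups writes and syncs in the background.

```python
import os
import tempfile
import time


class ToyCommitLog:
    """Toy append-only log with 'periodic' or 'batch' fsync behaviour."""

    def __init__(self, path, sync_mode="periodic", period_s=10.0):
        if sync_mode not in ("periodic", "batch"):
            raise ValueError("sync_mode must be 'periodic' or 'batch'")
        self._file = open(path, "ab")
        self._mode = sync_mode
        self._period_s = period_s
        self._last_sync = time.monotonic()
        self.sync_count = 0

    def _sync(self):
        # Push buffered bytes to the OS, then force them to disk.
        self._file.flush()
        os.fsync(self._file.fileno())
        self._last_sync = time.monotonic()
        self.sync_count += 1

    def append(self, record: bytes):
        self._file.write(record + b"\n")
        if self._mode == "batch":
            self._sync()  # the writer blocks until the record is on disk
        elif time.monotonic() - self._last_sync >= self._period_s:
            self._sync()  # periodic: sync only when the period has elapsed

    def close(self):
        self._sync()
        self._file.close()


def count_syncs(sync_mode, writes=100):
    """Append `writes` records and report how many fsyncs the appends caused."""
    with tempfile.TemporaryDirectory() as tmp:
        log = ToyCommitLog(os.path.join(tmp, "commitlog.bin"), sync_mode)
        for i in range(writes):
            log.append(f"mutation-{i}".encode("utf-8"))
        syncs = log.sync_count
        log.close()
    return syncs


print("batch:", count_syncs("batch"), "periodic:", count_syncs("periodic"))
```

Every fsync is a blocking disk operation, which is why batch mode costs write latency and why a dedicated commit log disk helps.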
Performance tips – 9-10 (compression)
Use compression – Depending on the data characteristics of the table, compressing its data can result in a 2x-4x reduction in data size, a 25-35% performance improvement on reads, and a 5-10% performance improvement on writes
Concurrent_compactors – Sets the number of concurrent compaction processes allowed to run simultaneously on a node
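How much compression helps depends heavily on the data, which can be checked quickly offline. The sketch below uses Python's `zlib` as a stand-in compressor and made-up sample rows; both are assumptions for illustration, so the measured ratio says nothing about a real table.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


# Made-up, highly repetitive rows; real tables will behave differently.
rows = "".join(
    f"user{i % 50},2024-01-01,status=active\n" for i in range(5000)
).encode("utf-8")

ratio = compression_ratio(rows)
print(f"compression ratio: {ratio:.1f}x")
```

Repetitive, text-like columns compress well; already-compressed or random-looking data gains little and only pays the CPU cost.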
Performance tips – 11-13 (timeouts)
Range_request_timeout_in_ms – The time that the coordinator waits for sequential or index scans to complete
Read_request_timeout_in_ms – The time that the coordinator waits for read operations to complete
Write_request_timeout_in_ms – The time that the coordinator waits for writes to complete
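The coordinator's behaviour under these timeouts can be sketched as "wait for the replica, but only up to a limit". Everything here is illustrative: `coordinate`, `slow_replica`, and the millisecond values in `TIMEOUTS_MS` are hypothetical and are not Cassandra's defaults.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical per-operation limits, in milliseconds (not real defaults).
TIMEOUTS_MS = {"read": 100, "write": 100, "range": 200}


def coordinate(kind, replica_call, *args):
    """Run a replica call, giving up after the timeout for this kind."""
    timeout_s = TIMEOUTS_MS[kind] / 1000.0
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(replica_call, *args).result(timeout=timeout_s)
    except FutureTimeout:
        return "timed out"
    finally:
        # Do not block the coordinator on the still-running replica call.
        pool.shutdown(wait=False)


def slow_replica(delay_s):
    """Stand-in for a replica that takes `delay_s` seconds to answer."""
    time.sleep(delay_s)
    return "ok"


print(coordinate("read", slow_replica, 0.0))
print(coordinate("read", slow_replica, 0.5))
```

Timeouts set too low turn slow-but-successful operations into client errors; set too high, they let stuck requests pile up on the coordinator.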
Performance tips – 14-15 (RPC settings for clients)
Rpc_keepalive – Enable or disable keepalive on client connections
Rpc_max_threads – the maximum number of threads in the RPC thread pool, which dictates how many concurrent requests are possible
Performance tips – Misc.
Dynamic Snitching – dynamic snitch monitors read latency and, when possible, routes requests away from poorly-performing nodes
Bloom filter (bloom_filter_fp_chance – desired false-positive probability for SSTable Bloom filters): higher settings use less memory, but will result in more disk I/O
Keep an eye on compaction (compaction_throughput_mb_per_sec is used to throttle compaction overhead)
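The memory side of the bloom_filter_fp_chance trade-off can be estimated with the textbook formula for an optimally sized Bloom filter, bits per key = -ln(p) / (ln 2)^2. This is a sketch using that general formula; `bloom_bits_per_key` is a hypothetical helper, and Cassandra's own sizing may differ in detail.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimal Bloom filter with the given FP rate."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


for p in (0.001, 0.01, 0.1):
    print(f"fp_chance={p}: {bloom_bits_per_key(p):.2f} bits per key")
```

Raising the false-positive chance from 0.01 to 0.1 roughly halves the filter size, at the price of more SSTable reads that turn out to be unnecessary.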
Performance tips – JVM
Standard JVM tuning applies
Heap min and max size – set both to the same value
Assertions – disable assertions when launching the JVM
Survivor Ratio, MaxTenuringThreshold and GC algorithm
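As a sketch, the JVM tips above map onto HotSpot command-line options roughly as follows. The helper `jvm_options` is hypothetical, and the heap size, ratio, threshold, and collector shown are placeholder values chosen for illustration, not recommendations; the CMS flag only exists on JVMs that still ship that collector.

```python
def jvm_options(heap="8G", survivor_ratio=8, max_tenuring_threshold=1,
                gc_flag="-XX:+UseConcMarkSweepGC"):
    """Build a JVM option list reflecting the tips above (placeholder values)."""
    return [
        f"-Xms{heap}",  # minimum heap size ...
        f"-Xmx{heap}",  # ... set equal to the maximum heap size
        "-da",          # disable assertions
        f"-XX:SurvivorRatio={survivor_ratio}",
        f"-XX:MaxTenuringThreshold={max_tenuring_threshold}",
        gc_flag,        # GC algorithm; pick one your JVM supports
    ]


print(" ".join(jvm_options(heap="4G")))
```

Setting -Xms and -Xmx to the same value avoids heap resizing pauses while the node is under load.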
Hope it worked for you !!