Why We Chose C++ Over Java
This document explains why we chose C++ rather than Java as the implementation language. There are two fundamental reasons why C++ is superior to Java for this particular application.
- Hypertable is memory (malloc) intensive. Hypertable caches all updates in an in-memory data structure (e.g. an STL map). Periodically, these in-memory data structures get spilled to disk. When the number of spilled files reaches a certain threshold, they get merged together to form larger files. The performance of the system is, in large part, dictated by how much memory it has available. Less memory means more spilling and merging, which increases the load on the network and the underlying DFS. It also increases the CPU work required of the system, in the form of extra heap-merge operations. Java is a poor choice for memory-hungry applications. In particular, when managing a large in-memory map of key/value pairs, Java's memory performance is poor in comparison with C++. It's on the order of two to three times worse (if you don't believe me, try it).
- Hypertable is CPU intensive in several places. The first is the in-memory maps of key/value pairs. Traversing and managing those maps can consume a lot of CPU. Plus, given Java's inefficient use of memory with regard to these maps, the processor caches become much less effective. A recent run of the tool Calibrator (http://monetdb.cwi.nl/Calibrator/) on one of our 2GHz Opterons yielded the following statistics:
caches:
level  size    linesize   miss-latency        replace-time
1      64 KB   64 bytes    6.06 ns =  12 cy    5.60 ns =  11 cy
2      768 KB  128 bytes  74.26 ns = 149 cy   75.90 ns = 152 cy
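The spill-and-merge cycle described in the first point can be sketched roughly as follows. This is a minimal illustration, not Hypertable's actual code: the names MemTable, Run, and heap_merge are hypothetical, spilled "files" are kept as in-memory vectors instead of being written to a DFS, and the size accounting counts only key and value bytes, ignoring the per-node overhead of the map that the memory argument above is concerned with.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not Hypertable's actual classes.
using Entry = std::pair<std::string, std::string>;
using Run = std::vector<Entry>;  // stands in for one sorted spilled file

class MemTable {
public:
    explicit MemTable(std::size_t limit_bytes) : limit_(limit_bytes) {}

    // Buffer an update; spill the whole map as one sorted run once the
    // approximate payload size reaches the limit.
    void put(const std::string& key, const std::string& value) {
        auto it = map_.find(key);
        if (it != map_.end()) {
            bytes_ -= it->second.size();
            it->second = value;
            bytes_ += value.size();
        } else {
            map_.emplace(key, value);
            bytes_ += key.size() + value.size();
        }
        if (bytes_ >= limit_) spill();
    }

    // Move the current map contents into a new run (std::map iterates in key order).
    void spill() {
        if (map_.empty()) return;
        runs_.emplace_back(map_.begin(), map_.end());
        map_.clear();
        bytes_ = 0;
    }

    const std::vector<Run>& runs() const { return runs_; }

private:
    std::size_t limit_;
    std::size_t bytes_ = 0;  // payload bytes only; map node overhead is not counted
    std::map<std::string, std::string> map_;
    std::vector<Run> runs_;
};

// K-way heap merge of sorted runs. Runs with a higher index are newer, so
// for equal keys the entry from the highest-indexed run is kept.
Run heap_merge(const std::vector<Run>& runs) {
    struct Cursor { std::size_t run; std::size_t pos; };
    auto after = [&runs](const Cursor& a, const Cursor& b) {
        const std::string& ka = runs[a.run][a.pos].first;
        const std::string& kb = runs[b.run][b.pos].first;
        if (ka != kb) return ka > kb;  // smallest key ends up on top
        return a.run < b.run;          // ties: newest run ends up on top
    };
    std::priority_queue<Cursor, std::vector<Cursor>, decltype(after)> heap(after);
    for (std::size_t r = 0; r < runs.size(); ++r)
        if (!runs[r].empty()) heap.push({r, 0});

    Run out;
    while (!heap.empty()) {
        Cursor c = heap.top();
        heap.pop();
        const Entry& e = runs[c.run][c.pos];
        // Older duplicates of a key arrive after the newest one and are skipped.
        if (out.empty() || out.back().first != e.first) out.push_back(e);
        if (c.pos + 1 < runs[c.run].size()) heap.push({c.run, c.pos + 1});
    }
    return out;
}
```

With a smaller limit, the same sequence of put calls produces more runs, and heap_merge then has more cursors to keep in its heap. That is the extra merge work the text attributes to having less memory available.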
Source: http://www.w3cool.com/2008/07/05/hypertablecjava.html