I've just started working on a new project. The software handles some 39 million transactions per day and must store about 3 GB of new data daily (it already holds 4 TB). Essentially, it's a repository of trade and market data that must handle 8 million saves, 26 million loads and nearly a million queries every day.
One of the core technologies is Oracle's Coherence, an in-memory data grid. We run it across 8 boxes with 5 instances on each (plus a contingency environment). Since I am new to Coherence, I've been taking notes, which I am publishing here (with extensive references to Oracle Coherence 3.5, even though we're using 3.7).
Cache Topologies
Near Cache - "A near cache is a hybrid, two-tier caching topology that uses a combination of a local, size-limited cache in the front tier, and a partitioned cache in the back tier to achieve the best of both worlds: the zero-latency read access of a replicated cache and the linear scalability of a partitioned cache". Its invalidation strategies are: none, present, all and auto.
Continuous Query Cache - "very similar to a near cache [but] ... populates its front cache based on a query as soon as it is created; ... registers a query-based listener with the back cache, which means that its contents change dynamically as the data in the back cache changes... Basically CQC allows you to have a live dynamic view of the filtered subset of data in a partitioned cache."
Replicated - "Simply replicates all data to all cluster nodes".
Optimistic - Like replicated only with no concurrency control.
Partitioned - "Uses a divide and conquer approach... [it] is reminiscent of a distributed hash map. Each node in the cluster becomes responsible for a subset of cache partitions (buckets), and the Partitioned Cache service uses an entry key to determine which partition (bucket) to store the cache entry in".
Local - totally contained within a particular cluster node.
Remote - Any out-of-process cache accessed by a Coherence*Extend client. All cache requests are sent to a Coherence proxy, where they are delegated to one of the other Coherence cache types (Replicated, Optimistic, Partitioned).
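To make the partitioned topology concrete, here is a toy sketch (not the Coherence implementation; all class and method names are invented) of the routing idea: the key's hash selects a partition (bucket), and each cluster node owns a subset of the partitions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of partitioned-cache routing: key -> partition -> owning node.
public class PartitionRouting {
    static final int PARTITIONS = 257;  // Coherence's default partition count is 257
    static final List<String> NODES = List.of("node-1", "node-2", "node-3");

    // The entry key determines which partition (bucket) the entry lands in.
    static int partitionFor(Object key) {
        return Math.floorMod(key.hashCode(), PARTITIONS);
    }

    // Each node is responsible for a subset of partitions (round-robin here).
    static String ownerOf(int partition) {
        return NODES.get(partition % NODES.size());
    }

    public static void main(String[] args) {
        Map<String, String> placement = new HashMap<>();
        for (String key : List.of("TRADE-1001", "TRADE-1002", "FX-EURUSD")) {
            int p = partitionFor(key);
            placement.put(key, ownerOf(p));
            System.out.println(key + " -> partition " + p + " on " + ownerOf(p));
        }
    }
}
```

The point of the divide-and-conquer layout is that adding nodes redistributes partitions, so capacity and throughput scale roughly linearly.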
Backing Maps
Backing maps are "where cache data within the cluster is actually stored."
- Local cache - "a backing map for replicated and partitioned caches and as a front cache for near and continuous query caches. It stores all the data on the heap, which means that it provides by far the fastest access speed, both read and write compared to other backing map implementations."
- External backing map - "allows you to store cache items off-heap, thus allowing far greater storage capacity, at the cost of somewhat-to-significantly worse performance. There are several pluggable storage strategies... These strategies are implemented as storage managers." These include NIO Memory Manager, NIO File Manager, Berkeley DB Store Manager and "a wrapper storage manager that allows you to make write operations asynchronous for any of the store managers listed earlier".
- Paged external backing map - "very similar to the external backing map... The big difference between the two is that a paged external backing map uses paging to optimize LRU eviction".
- Overflow backing map - "a composite backing map with two tiers: a fast, size-limited, in-memory front tier, and a slower, but potentially much larger back tier on a disk."
- Read-write backing map - "another composite backing map implementation... the read-write backing map has a single internal cache (usually a local cache) and either a cache loader, which allows it to load data from the external data source on cache misses, or a cache store, which also provides the ability to update data in the external data store on cache puts."
- Partitioned backing map - "contains one backing map instance for each cache partition, which allows you to scale the cache size simply by increasing the number of partitions for a given cache".
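The read-write backing map's cache-loader behaviour can be sketched in plain Java (this is an illustration of the read-through idea, not Coherence's actual CacheLoader/CacheStore API; the loader function here stands in for the external data source):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of a read-write backing map's read-through path: a small internal
// map backed by a "cache loader" that is only consulted on a cache miss.
public class ReadThroughCache<K, V> {
    private final Map<K, V> internal = new HashMap<>();  // the internal (local) cache
    private final Function<K, V> loader;                 // stands in for a CacheLoader
    private int misses;

    public ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        V value = internal.get(key);
        if (value == null) {            // miss: load from the external data source
            value = loader.apply(key);
            internal.put(key, value);
            misses++;
        }
        return value;
    }

    public int misses() { return misses; }

    public static void main(String[] args) {
        ReadThroughCache<String, String> cache =
                new ReadThroughCache<>(key -> "row-for-" + key);  // fake database lookup
        System.out.println(cache.get("TRADE-1"));  // miss: loaded from the "database"
        System.out.println(cache.get("TRADE-1"));  // hit: served from the internal map
        System.out.println(cache.misses());        // 1
    }
}
```

A cache store extends the same idea to writes: on a put, the entry is also written to (or scheduled for writing to) the external store.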
Some important classes and interfaces
PortableObject - "The PortableObject interface is implemented by Java classes that can self- serialize and deserialize their state to and from a POF data stream."
NamedCache - "A NamedCache is a Map that holds resources shared among members of a cluster. These resources are expected to be managed in memory, and are typically composed of data that are also stored persistently in a database, or data that have been assembled or calculated at some significant cost, thus these resources are referred to as cached."
InvocableMap - "An InvocableMap is a Map against which both entry-targeted processing and aggregating operations can be invoked. While a traditional model for working with a Map is to have an operation access and mutate the Map directly through its API, the InvocableMap allows that model of operation to be inverted such that the operations against the Map contents are executed by (and thus within the localized context of) a Map. This is particularly useful in a distributed environment, because it enables the processing to be moved to the location at which the entries-to-be-processed are being managed, thus providing efficiency by localization of processing."
A recurring message in the JavaDocs:
Note: When using the Coherence Enterprise Edition or Grid Edition, the Partitioned Cache implements the InvocableMap interface by partitioning and localizing the invocations, resulting in extremely high throughput and low latency. When using Coherence Standard Edition, the InvocableMap processes the invocations on the originating node, typically resulting in higher network, memory and CPU utilization, which translates to lower performance, and particularly when processing large data sets.
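The "inverted" model the InvocableMap JavaDoc describes can be sketched in plain Java (names here are invented; the real interface is com.tangosol.util.InvocableMap with EntryProcessor implementations): instead of the caller doing get-modify-put across the network, a processor is shipped to the map and executed against the entry where it lives.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Toy sketch of entry-targeted processing: the processor runs "inside" the
// map (in a real grid, on the node owning the key's partition), so the data
// does not travel to the caller and back.
public class ProcessorMap<K, V> {
    private final Map<K, V> data = new HashMap<>();

    public void put(K key, V value) { data.put(key, value); }
    public V get(K key) { return data.get(key); }

    // Execute the processor against the entry in place and return the result.
    public V invoke(K key, UnaryOperator<V> processor) {
        V updated = processor.apply(data.get(key));
        data.put(key, updated);
        return updated;
    }

    public static void main(String[] args) {
        ProcessorMap<String, Integer> positions = new ProcessorMap<>();
        positions.put("EURUSD", 100);
        // The increment runs against the entry rather than via get/put.
        positions.invoke("EURUSD", qty -> qty + 50);
        System.out.println(positions.get("EURUSD"));  // 150
    }
}
```

This is exactly the distinction the JavaDoc note makes: Enterprise/Grid Edition executes the processor on the node owning the partition, while Standard Edition runs it on the originating node and pays the data-transfer cost.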