MLC Flash for Big Data Acceleration

Big data analysis demands bandwidth and concurrent access to stored data. The write load depends on data ingest rates and batch processing demands; the data involved is typically new data plus updates to existing data. Indices and other metadata may be recalculated, but this is generally not done in real time. The economics of supporting such workloads come down to cost-effectively providing bulk access for concurrent streams. If only a single stream is being processed, spinning disk is fine. However, providing highly concurrent access to the dataset requires either a widely striped caching solution or a clustered architecture with local disk (e.g., Hadoop). Because write lifetimes for flash are not stressed in this environment, utilizing wide stripes of MLC for caching is the most cost-effective way to provide highly concurrent access to the dataset in a shared-storage environment.

Now, a lot of the SLC versus MLC debate centers on blocking and write performance – specifically, write latency and its blocking impact on reads. With a traditional storage layout, data can be striped over only a few disks (four data disks for a typical RAID 5/6 stripe). This creates a high probability of read blocking under even the smallest write loads. By distributing the data over very wide non-RAID stripes (up to 40 disks wide), the effect of variable write latency can be mitigated: the least-read disks are dynamically selected for new cache data, greatly reducing the impact of writes on the general read load. The wider the striping of physical disks in the caching media, the better the support for concurrent access and mixed read/write loads from the application. MLC is an excellent media choice here, both technically and economically.
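To make the least-read placement idea concrete, here is a minimal Python sketch of that policy. The class and numbers are illustrative assumptions, not our actual implementation – the point is simply that new cache data lands on whichever devices in the wide stripe are currently serving the fewest reads.

    import heapq
    from collections import defaultdict

    class LeastReadPlacer:
        """Toy placement policy for a wide (non-RAID) cache stripe.

        Tracks outstanding reads per cache device and steers new cache data
        to the devices currently serving the fewest reads, so MLC write
        latency lands where it blocks the least read traffic.
        """

        def __init__(self, num_devices=40):
            self.devices = list(range(num_devices))
            self.outstanding_reads = defaultdict(int)

        def note_read_start(self, device):
            self.outstanding_reads[device] += 1

        def note_read_done(self, device):
            self.outstanding_reads[device] -= 1

        def pick_targets(self, copies=2):
            # Choose the least-busy devices for the incoming cache data.
            return heapq.nsmallest(copies, self.devices,
                                   key=lambda d: self.outstanding_reads[d])

    # Example: device 3 is busy serving reads, so new cache data avoids it.
    placer = LeastReadPlacer()
    for _ in range(5):
        placer.note_read_start(3)
    print(placer.pick_targets())   # -> [0, 1]; device 3 is skipped

With 40 devices to choose from there is almost always a quiet place to put a write, which is exactly why wide striping tames MLC's variable write latency.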

By employing affordable MLC as a write-through caching layer that stays consistent with the backend storage, even multiple simultaneous flash SSD failures can be masked. Most traditional storage systems cannot survive multiple concurrent drive failures and suffer significant performance degradation while recovering (rebuilding) from even a single device failure. A cache system, by contrast, can continue operation in the face of cache media failures by simply fetching the missing data from the storage system and redistributing it to the remaining caching media. It's important to note, however, that placing the cache in front of the storage controller is critical to achieving concurrency – the storage controller lacks the horsepower necessary to sustain this performance on its own – but that's a topic for another day.
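A toy model helps show why write-through matters here. In the Python sketch below (names and structure are illustrative, not a real product interface), every write is committed to the backend before it is cached, so losing a cache device loses nothing – the next read simply falls back to the backend and re-populates a surviving device.

    class WriteThroughCache:
        """Toy write-through cache over a durable backend store."""

        def __init__(self, backend, devices):
            self.backend = backend                    # dict-like durable store
            self.devices = {d: {} for d in devices}   # per-device cache contents
            self.location = {}                        # key -> cache device

        def write(self, key, value):
            self.backend[key] = value                 # backend first: cache never holds the only copy
            if self.devices:
                self._place(key, value)

        def read(self, key):
            device = self.location.get(key)
            if device in self.devices and key in self.devices[device]:
                return self.devices[device][key]      # cache hit
            value = self.backend[key]                 # miss or dead device: refetch from backend
            if self.devices:
                self._place(key, value)               # redistribute to a surviving device
            return value

        def fail_device(self, device):
            self.devices.pop(device, None)            # lose a cache SSD: no data loss

        def _place(self, key, value):
            # Put the data on the least-loaded surviving cache device.
            device = min(self.devices, key=lambda d: len(self.devices[d]))
            self.devices[device][key] = value
            self.location[key] = device

    # Example: a cache device dies, yet reads still succeed from the backend.
    cache = WriteThroughCache(backend={}, devices=["ssd0", "ssd1"])
    cache.write("block42", b"data")
    cache.fail_device(cache.location["block42"])
    assert cache.read("block42") == b"data"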

MLC is driving the price point of flash toward that of enterprise high-performance spinning disk. Constant growth in the consumer space means that MLC will continue to be the most cost-effective flash technology and will benefit the most from technology scaling and packaging innovations. Lower-volume technologies such as eMLC and SLC do not share the same economic drivers and thus will remain much more expensive. The ability to utilize MLC efficiently and adapt it to meet the performance and access needs of Big Data will be hugely advantageous to customers – and to the vendors who can deliver intelligent, cost-effective solutions built on it, such as the GridIron TurboCharger™!

Unstructured, structured and relational data – how big is big?

So now that we are definitely in love with big data…how big does it have to be before we really consider it big?

Well…it depends.

Something is really not that big if it’s just sitting there and you are not hauling it around.

See – when Mr. von Neumann laid down the seminal architecture for stored-program computers, he definitely chose sides! The ‘program’ was the quarterback, and data played a decidedly subservient role – always at the program’s beck and call, to be hauled and mauled as it saw fit.

Programs ‘fetch’ data – at their leisure, at their chosen time.

Even the operative term sounds more fitting for your Corgi than for someone or something more serious!

So we have been writing code that merrily ‘fetches’ data and processes it. Works for most programs. Except when data grows. And grows. And grows…

It grows until it starts to be a real problem to just ‘fetch’ it. And it becomes a real pain to move it around. Now you have to think about –

  1. perhaps changing roles and sending the ‘program’ to the data instead of the other way around
  2. being smart about moving ONLY the required amount of data

For a PC-XT with a whopping 10MB hard drive, big data was just 10MB. That was the entire drive! The little 8088 CPU running at 4.77MHz on an 8-bit bus could scream along at 4.77 MB/sec and could (theoretically) finish scanning the disk in roughly 2 seconds.

My desktop runs an i7-2600 CPU with 4 hyperthreaded cores at 3.4 GHz. This beastie can chew through data at a little under 100GB/sec (again, theoretically) – taking about 20 seconds to scan my 2TB hard drive.

Let’s take a look at that workhorse of enterprise relational data crunching – Oracle RAC. A state-of-the-art 4-node RAC system should be able to scan in data at 4 to 5 GB/sec per node from the storage – roughly 20 GB/sec in aggregate. At that rate, a database can load at 60+ TB/hour. Throw in scheduling overhead, network latency, and error checks and you are looking at 10 to 20 TB/hour. That’s still a very impressive number – giving you head-spinning bragging rights in Oracle OpenWorld data warehouse tutorial sessions…

Now consider a 50TB data warehouse; not too extreme by Oracle standards, but now we are talking about 3 hours or more to JUST LOAD THE DATABASE.

We’re not just ‘fetching’ data anymore, are we?

50TB is “big data” for Oracle RAC, even more so for single-instance Oracle installations.
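For the curious, here is the back-of-envelope arithmetic behind the scan and load numbers above, as a quick Python sketch (theoretical peak rates only, using the same assumptions as in the text):

    # Back-of-envelope check of the scan/load numbers above (theoretical peak rates).

    def scan_seconds(size_gb, rate_gb_per_sec):
        return size_gb / rate_gb_per_sec

    print(scan_seconds(0.01, 0.00477))         # PC-XT: 10 MB at 4.77 MB/s    -> ~2.1 s
    print(scan_seconds(2048, 100))             # desktop: 2 TB at ~100 GB/s   -> ~20 s
    print(scan_seconds(50 * 1024, 20) / 3600)  # 4-node RAC: 50 TB at 20 GB/s -> ~0.7 h peak (60+ TB/hr)
    print(50 / 20, 50 / 10)                    # at a realistic 10-20 TB/hr   -> 2.5 to 5 hours to load 50 TB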

Even the ne plus ultra of NoSQL – Hadoop – is not used in isolation. Typically a Hadoop processing stage is followed by Hive or other structured databases – even MySQL.

So a big ‘unstructured’ data setup may just as easily feed into a big ‘structured’ data analysis stage. How big do they typically get before the big data characteristics start to show (difficulty in fetching the entire dataset, sending the program to stationary data, etc.)? Here is my take:

Hadoop – The top dogs may sneer at anything below a petabyte, but in reality a 100TB Hadoop/NoSQL cluster is getting big. You can’t just deal with it casually; it demands care and feeding.

MySQL cluster – A 100-node cluster in the 100TB size range is certainly getting there.

Oracle, including RAC – 50TB and up…especially DSS (Decision Support System) and data warehouse workloads. Folks at Amazon and eBay run some very impressive big data warehouses on Oracle. Then there are the installations at “those who shall not be named.”

Hadoop is loved because it’s (supposedly) an open-ended framework when it comes to data size. Petabytes of data pouring in? No problem – just add more nodes as your data grows – no need to change your program; the same Java code works. But remember the story of the war elephants crossing the Alps – just because Mr. H. Barca decided to do it does not mean you should consider it easy. Tilting at a 1,000-node cluster with Hadoop is a day’s work for Google, but not for a typical enterprise CIO.

We’ll explore challenges unique to big structured/relational data and big unstructured data in the coming posts…