I’ve written previously on transaction logging, the feature in eXtremeDB (and other database systems) that adds durability to an in-memory database, enabling recovery in the event of system failure. McObject has completed new benchmark tests focusing on transaction logging and performance, using EMC’s XtremSF server-based PCIe flash card technology.
The test results, presented in the free report Gaining an Extreme Performance Advantage, drive home three points: 1) an in-memory database system with transaction logging vastly outperforms a traditional “on-disk” DBMS; 2) this speed advantage is multiplied through careful choice of transaction log storage device; 3) McObject’s database system and EMC’s server-based PCIe flash cards reap the benefits of multi-threading software on multi-core CPUs – especially when the two vendors’ technologies are used together.
For the database operations most likely to induce latency, the eXtremeDB IMDS with transaction logging (IMDS+TL), storing its transaction log on a hard disk drive (HDD), outperformed the on-disk DBMS with HDD storage by more than 5 times. Storing the transaction log using EMC’s technology improved this performance edge dramatically, driving the IMDS+TL to achieve a 2,100% speed advantage over the on-disk DBMS configuration.
A second phase of the benchmark tested scalability by measuring throughput while adding concurrent processes to interact with the database system and its storage. EMC’s and McObject’s technologies both leverage multi-threading on multi-core CPUs. In McObject’s tests using the persistent storage capabilities provided by eXtremeDB Fusion edition (that is, using eXtremeDB as a “pure” on-disk DBMS) and storing the database on EMC XtremSF, the benchmark application completed 3.89 loops per millisecond with 2 processes running simultaneously. With 36 concurrent processes, this grew to 12.21 loops/ms, which is 314% of the two-process rate (an increase of about 214%).
However, when moving to the eXtremeDB In-Memory Database System with transaction logging, and using EMC XtremSF technology to store the log, the rate of throughput increase was even more impressive – loops per second increased by 505% when scaling to 36 processes from 2 processes.
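Scaling figures like these can be read either as a ratio ("N times the original rate") or as a percent increase, and the two differ by 100 percentage points. The following is a minimal sketch, using only the two on-disk throughput figures quoted above (3.89 and 12.21 loops/ms), that computes both readings. The function name `scaling_summary` is illustrative and not part of any McObject or EMC tool.

```python
def scaling_summary(base_rate, scaled_rate):
    """Return (ratio, percent_increase) between two throughput measurements."""
    ratio = scaled_rate / base_rate
    percent_increase = (ratio - 1.0) * 100.0
    return ratio, percent_increase


# On-disk configuration: 2 processes vs. 36 processes, in loops per millisecond.
ratio, pct = scaling_summary(3.89, 12.21)
print(f"{ratio:.2f}x the 2-process rate, a {pct:.0f}% increase")
```

When comparing scalability numbers from different reports, it is worth checking which of the two readings a quoted percentage uses.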
If slashing latency is the goal, main memory (DRAM) is definitely the place to store data. System architects recognize this, hence the soaring popularity of in-memory database systems (IMDSs).
But DRAM is volatile – some applications need greater durability in the event that someone pulls the plug on the system.
What if DRAM could be made persistent, to “freeze” in-memory data at the moment of system failure? That is the capability delivered by AgigA Tech’s AGIGARAM non-volatile DIMM (NVDIMM), which combines standard DRAM with NAND flash and an ultracapacitor power source. In the first test of its kind, McObject has successfully tested its eXtremeDB IMDS using AgigA Tech's AGIGARAM NVDIMM as main memory storage.
The tests included “pulling the plug” mid-execution, which confirmed the AGIGARAM product’s ability to save data persistently in the event of system failure, and to facilitate recovery. The benchmark tests also showed eXtremeDB’s speed managing data in AgigA Tech’s NVDIMM to be equal to using conventional memory (DRAM). McObject presents the test results in a free report.
To understand the importance of these tests, consider the alternative methods for adding durability to an in-memory database. IMDSs typically support transaction logging, which records (to persistent media) changes to the database and can be used to recover the database after a crash. But this logging adds latency - elimination of which is typically why an IMDS is chosen in the first place!
Another option is to use DRAM backed up by a battery. However, disadvantages of battery-backed RAM include restrictive temperature requirements, leakage risk, limited storage time, long re-charge cycles, finite battery shelf life, and overall high cost-of-ownership.
In contrast, AGIGARAM and eXtremeDB together manage data at DRAM speed, but with persistence and none of the drawbacks of battery-backed RAM. This combination opens the door to a new and powerful approach to database-enabling applications that demand both speed and durability, including mission critical systems for telecom/networking, capital markets, aerospace and industrial control.
Main memory (DRAM) is the fastest storage medium for a database management system. But once you have a highly efficient in-memory database system, how do you reduce latency further?
Our new white paper, "Pipelining Vector-Based Statistical Functions for In-Memory Analytics", available for free download, explains McObject’s focus on optimizing CPU L1/L2 cache use in eXtremeDB Financial Edition.
The product’s two key features in this area are its support for columnar data handling (via its “sequence” data type), and the programming technique of pipelining using eXtremeDB Financial Edition’s library of vector-based statistical functions.
Columnar handling maximizes the proportion of relevant data brought into L1/L2 cache with every fetch.
Pipelining causes interim results to remain in L1/L2 cache during processing, rather than being output as temporary tables in main memory. Both columnar data handling and pipelining eliminate latency by minimizing transfers between main memory and L1/L2 cache.
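To make the data-flow difference concrete, here is a rough sketch in plain Python. It is not eXtremeDB Financial Edition code and does not use its sequence type or function library. The helper names (`chunks`, `materialized_mean_spread`, `pipelined_mean_spread`) and the chunk size are assumptions chosen for illustration. Bid and ask prices are held as separate columns, and the pipelined version pushes each small block through every stage before reading the next block, instead of building a full-length temporary result first. Python cannot control CPU cache placement, so this only illustrates the shape of the computation, not the cache behavior itself.

```python
from itertools import islice


def chunks(column, size):
    """Yield successive fixed-size blocks of a column (a list of numbers)."""
    it = iter(column)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def materialized_mean_spread(bid, ask):
    """Non-pipelined: build a full-length temporary column, then aggregate it."""
    spread = [a - b for a, b in zip(ask, bid)]  # full-size interim result
    return sum(spread) / len(spread)


def pipelined_mean_spread(bid, ask, chunk_size=4):
    """Pipelined: each block passes through subtract and sum before the next is read."""
    total = 0.0
    count = 0
    for bid_block, ask_block in zip(chunks(bid, chunk_size), chunks(ask, chunk_size)):
        spread_block = [a - b for a, b in zip(ask_block, bid_block)]  # small interim block
        total += sum(spread_block)
        count += len(spread_block)
    return total / count


# Columnar layout: one list per field rather than one record per tick.
bid = [10.0, 10.5, 11.0, 10.0, 12.0]
ask = [10.5, 11.0, 11.5, 10.5, 12.5]
print(materialized_mean_spread(bid, ask), pipelined_mean_spread(bid, ask))
```

Both functions return the same answer. The difference is that the pipelined form never holds more than one block of interim data at a time.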
McObject has released a new video.
The topic is eXtremeDB Financial Edition and the performance gains that can be achieved by “pipelining” that product’s vector-based statistical functions. The video has code-level details to interest programmers, and animations that explain important concepts – including row- vs. column-based data handling, and the benefits of pipelining to optimize L1/L2 CPU cache usage – at a higher level.
Pipelining is a powerful technique that should interest folks developing real-time capital markets applications (algo trading, risk management, quantitative analysis, etc.)
It’s on YouTube now:
By the way, have you checked out our first, introductory eXtremeDB Financial Edition video? It is accessible via YouTube at http://www.youtube.com/watch?v=FlsezbvWpbE.
In-memory database systems (IMDSs) offer breakthrough performance by eliminating I/O, caching, data transfer, and other overhead that is hard-wired into traditional "on-disk" database management systems (DBMSs). But some applications require a high level of data durability. In other words, what happens to the data if somebody pulls the plug?
As a solution, IMDSs offer transaction logging, in which changes to the database are recorded in a log that can be used to automatically recover the database, in the event of failure.
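As a rough illustration of the idea (not of eXtremeDB's actual implementation or log format), the sketch below keeps data in an ordinary Python dictionary and appends every change to a log file before applying it. On startup, the log is replayed to rebuild the in-memory state. The class name `LoggedStore`, the JSON-lines record layout, and the `demo` helper are all hypothetical choices made for this example.

```python
import json
import os
import tempfile


class LoggedStore:
    """In-memory key/value store whose changes are appended to a log file."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._recover()  # rebuild in-memory state from any existing log
        self._log = open(log_path, "a", encoding="utf-8")

    def _recover(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # incomplete final record from a crash; stop replay
                self._apply(record)

    def _apply(self, record):
        if record["op"] == "put":
            self.data[record["key"]] = record["value"]
        elif record["op"] == "delete":
            self.data.pop(record["key"], None)

    def _commit(self, record):
        # Write and force the record to persistent media before changing memory.
        self._log.write(json.dumps(record) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self._apply(record)

    def put(self, key, value):
        self._commit({"op": "put", "key": key, "value": value})

    def delete(self, key):
        self._commit({"op": "delete", "key": key})

    def close(self):
        self._log.close()


def demo():
    """Write some changes, discard the in-memory store, and recover from the log."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "txn.log")
        store = LoggedStore(path)
        store.put("a", 1)
        store.put("b", 2)
        store.delete("b")
        store.close()  # the in-memory dictionary is now gone
        recovered = LoggedStore(path)
        result = dict(recovered.data)
        recovered.close()
        return result
```

Note that reads never touch the log. Only changes incur a sequential append, which is the only storage I/O in this design.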
Wait a second, critics say — doesn't this logging simply re-introduce storage-related latency, which is the very thing IMDSs have "designed out" to gain their high performance? Won't an IMDS with transaction logging (IMDS+TL) perform about the same as an on-disk DBMS?
As a leading IMDS vendor, McObject has always offered logical (if pretty technical) explanations of why this is not the case, i.e. why IMDSs with transaction logging retain their speed advantage. But nothing beats cold, hard evidence — so we decided to test our own claims. We also wanted to see how different data storage technologies affected any performance difference, so we ran all the tests using hard-disk drive (HDD), solid state drive (SSD), and a state-of-the-art NAND flash memory platform (Fusion ioDrive2). In the case of the traditional DBMS, these devices stored database records and the transaction log. With the IMDS+TL, they stored the transaction log.
Let’s talk results first, and then the reasons why. When inserting records into a database, moving from an on-disk DBMS to an IMDS with transaction logging (but still storing the transaction log on HDD) delivered a 3.2x performance gain. In other words, you can have your data durability, and triple your speed (and then some) for inserts, even when using garden variety HDD storage.
It gets even better, though, using today’s faster storage options to hold the IMDS’s transaction log. For example, storing the transaction log on flash memory-based SSDs boosted IMDS+TL insert performance 9.69x over the DBMS writing to hard disk.
Next, we swapped in the Fusion ioDrive2 to store the transaction log, and racked up a 20.05x performance gain over the DBMS+HDD combination. That’s right, inserting records into the database became more than 2,000% faster, while retaining transaction logging’s compelling data durability benefit (compelling, that is, to anyone who has lost critical data in a system crash). The gain was even more dramatic for database deletes, at 23.19x.
Why is an in-memory database system with transaction logging so much faster than a disk-based DBMS for the most I/O-intensive operations? First, on-disk DBMSs cache large amounts of data in memory to avoid disk writes. The algorithms required to manage this cache are a drain on speed and an IMDS (with or without transaction logging) eliminates the caching sub-system. There are other reasons, too. (As promised, they are technical.) For a full discussion, download McObject's free white paper, "In Search of Data Durability and High Performance: Benchmarking In-Memory & On-Disk Databases with Hard-Disk, SSD and Memory-Tier NAND Flash."
The paper describes other test scenarios — and in all of them, the Fusion ioDrive2 dramatically outperformed the other storage devices. Other pages on this Web site explain the reasons more eloquently and completely than I can. My understanding as a "database guy" is that Fusion ioDrive2 differs from SSD storage in that it presents flash to the host system as a new memory tier, integrating flash close to the host CPU and eliminating hardware and software layers that would otherwise introduce latency by standing between CPU and SSD storage devices.
As someone who works with system designers seeking the fastest possible responsiveness in fields ranging from telecommunications to financial trading, I can tell you that the prospect of boosting performance in I/O-intensive (high overhead) operations by more than 2,000% certainly finds a receptive audience.
(This text originally appeared as a guest post on Fusion-io's blog)
Most people justifiably take technology vendors’ claims of blazing speed with a grain of salt. As a result, industries that live and die by IT performance have developed independent, audited benchmarks to enable “apples to apples” comparisons between competing solutions.
For capital markets technology – that is, for systems like algorithmic trading, risk analysis, and order matching – these tests are defined and managed by the Securities Technology Analysis Center (STAC). In STAC's own words, it "provides hands-on technology research and testing tools to the trading industry, focusing on the most challenging workloads." It also verifies test results, to make sure participants don’t cut corners.
McObject has increased its presence in the financial technology sector, recognizing an opportunity there for a fast, deterministic and affordable database system. We reached a key milestone last week with the release of eXtremeDB Financial Edition, which includes powerful, specialized features for managing market data. Our own tests showed eXtremeDB Financial Edition to be approximately three times faster, when performing the same tasks, than a competing, specialized DBMS with a reputation for speed in financial applications.
However, we considered it critical to verify our performance using STAC’s audited benchmarking process – specifically, with the STAC-M3 benchmark suite, the trading industry’s gold standard for comparing performance in the management of time-series data such as tick-by-tick quote and trade histories.
I am pleased to announce that STAC has released its official report, confirming eXtremeDB Financial Edition’s best-in-category performance. The report – which STAC offers for free download (registration is required) – documents that eXtremeDB Financial Edition achieved the lowest mean response time ever published by STAC in 15 of 17 STAC-M3™ benchmark tests covering a wide range of key capital markets computing tasks.
Interestingly, McObject is only the second database system vendor to approve publication of its STAC-M3 results. We have it on good authority that several competing DBMS vendors in the financial arena underwrote STAC’s testing (it costs $5,000, plus a considerable investment in engineering and coordination with partners) but opted not to release their less-than-stellar results.
Speaking of partners, some impressive hardware helped eXtremeDB Financial Edition deliver the record-setting performance. McObject provided the database management software in a “system under test” consisting of a Dell R910 PowerEdge server with 40 Intel E7 cores and 512GB RAM and Kove® XPD™ L2 memory-disk SSD storage, mesh-connected via Mellanox ConnectX®-2 InfiniBand adapters through a Mellanox InfiniScale® IV QDR InfiniBand Switch to the Dell server. (The hardware is exactly the same as the previous performance record holder for the STAC-M3. We didn’t get better results by throwing faster hardware at it.)
The STAC-M3 benchmark is comprehensive in a way that our earlier, in-house tests (or probably any vendor’s tests) couldn’t approach. Its focus is real-world, based on complex queries that were designed by trading firms on the STAC Benchmark™ Council. A few highlights of eXtremeDB Financial Edition’s STAC-M3 results include:
- 9x the previously published best result on tests using the "Theoretical P&L" algorithm
- 7x the best previously published result for the "Year High Bid" algorithm
- Double the previously published best results for both “National Best Bid and Offer” (NBBO) and “Market Snapshot” algorithms
- Lower standard deviations of latency than the previously best results for 13 of the 17 operations.
This last result, record-setting low standard deviation, points to eXtremeDB Financial Edition’s high determinism – a characteristic that differs somewhat from pure speed. It means that financial calculations will be predictably fast. Capital markets developers tell us this is important, in that it means they can expect tighter bounds on application performance (performance will not swing wildly). Opportunities in algorithmic trading are fleeting, after all – and determinism means confidence that processing will not bog down at critical times.
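The distinction between speed and determinism can be seen with a toy calculation. The sketch below uses made-up latency samples (they are not STAC-M3 data) for two hypothetical systems with identical mean response times but very different standard deviations. The function name `latency_profile` is illustrative.

```python
import statistics


def latency_profile(samples):
    """Return (mean, population standard deviation) for a list of latencies."""
    return statistics.mean(samples), statistics.pstdev(samples)


# Hypothetical response times in microseconds; both series average 10.
steady = [10, 11, 9, 10, 10]
spiky = [5, 5, 5, 5, 30]

for name, samples in (("steady", steady), ("spiky", spiky)):
    mean, stdev = latency_profile(samples)
    print(f"{name}: mean={mean} stdev={stdev:.2f}")
```

A benchmark that reported only the mean would rate these two systems as equal. The standard deviation exposes the occasional slow response in the second one.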
Read McObject’s press release on eXtremeDB Financial Edition in the STAC-M3 benchmark.
Vicki Chan, U.S. reporter for Inside Market Data, recently broke the news that McObject is readying a database monitor for eXtremeDB, along with additional information about upcoming changes/improvements to eXtremeDB that we hope to roll out this quarter.
The database monitor will be an extension to the database browser that is currently shipped in the ./samples/core/22-httpview directory. Developers can take advantage of a browser interface that we provide, for their internal use. Also provided is an API to gather statistics, enabling developers to create their own GUI, with any desired look-and-feel (and branding), for distribution with their application. (Like the eXtremeDB schema compilers, mcocomp and sql2mco, our GUI for the performance monitor is not a redistributable component and will not be included with evaluation copies of eXtremeDB.)
Here are a couple of screenshots of the monitor. These are not final designs, but capture the essence of what can be expected in the final version. Keep in mind that while these screenshots depict database connection and storage metrics, the actual monitor will track about two dozen statistics. These metrics will enable developers and end-users to optimize their applications by closely monitoring the database system, observing the effect of changes on transaction throughput and other key statistics, and making changes based on these observations.
Check back here for more sneak peeks at the upcoming release.
McObject is an in-memory database system (IMDS) pioneer, so it shouldn’t surprise anyone that we protect our turf. Early on, we published a whitepaper to highlight the significant differences between an in-memory database and a database that happens to be in memory (for example, one deployed on a RAM-disk).
IMDSs have gained significant visibility in recent years, and their use across a wide range of application types, from avionics to financial systems, has increased. This is due primarily to (1) the efforts of McObject (over 100 articles published by or about McObject since 2002), Polyhedra (acquired by ENEA), Solid (acquired by IBM) and TimesTen (acquired by Oracle) to raise awareness of in-memory database technology, (2) the consistent reduction in cost-per-megabyte of RAM, and (3) the rise of new classes of applications with extremely demanding performance requirements.
We like this trend! But the “buzz” has permitted some obfuscation as to what an IMDS is, and is not. Specifically, some database vendors have endeavored to jump on the bandwagon by “in-memory-enabling” their legacy on-disk database systems. We’ve responded to this “me too”-ism in two recent whitepapers, In-Memory Database Systems: Myths and Facts and Will The Real IMDS Please Stand Up?
Rather than rehashing these papers, let’s sum them up thusly: When you hatch a database management system, it will, by design and implementation, be either an in-memory database system or an on-disk database management system. The choice affects the fundamental optimization strategies that will be baked into the database system code. To optimize an on-disk database is to minimize disk I/O, so its designers and developers will use extra CPU cycles and extra memory if doing so will reduce or eliminate I/O. Conversely, IMDSs by definition eliminate all disk I/O; their optimization is all about delivering the highest performance at a given level of processing power, thus reducing demand for CPU cycles is a key objective. And since memory is storage for an IMDS, reducing memory overhead (i.e. RAM consumed for anything other than storing data) is also a key objective.
Those optimizations are diametrically opposed. So it follows that you cannot, 5, 10, 15 or 20 years after an on-disk DBMS was designed and developed, suddenly turn it into an in-memory database system and expect the same performance or efficient memory use as a database system written to be in memory in the first place. The on-disk design goals mentioned above were baked in a long time ago, and while you will end up with something that is faster than the original, it won’t be as fast or as efficient in its use of memory as a true IMDS, created from scratch with the appropriate set of design goals.
When it comes to evolving a product to meet market demands, an in-memory database vendor has an advantage. They can always choose to use more CPU cycles and/or memory in order to add a feature. So an IMDS can evolve into a hybrid in-memory/on-disk database system and the on-disk implementation can be every bit as good as the database system that was originally written to be on-disk. In other words, it’s easy to add ingredients to your pie (CPU cycles and memory consumption). But once they’re in the pie mix, you can’t really take those ingredients out, short of a rewrite.
A couple of months ago, I began a two-part series on the misconception that in-house or open source software are lower-cost alternatives to commercial software. In the first part, I addressed in-house/roll-your-own development. In this post, I’ll discuss open source. And, repeating my opening comments from the first post, this is not going to be a rant about/against open source software. McObject offers open source software in our Perst object-oriented embedded database system for Java and .NET.
Everybody loves to get something for nothing, or even a great bargain. That’s human nature. I could say something pithy like “You get what you pay for,” but there’s no assurance that would be any more true for open source software than for a digital TV that on Black Friday costs a fraction of its suggested retail price. The corollary to that statement is, of course, that in the case of open source software it might be true that “you get what you pay for.” All open source software is not created equal.
In addition, what cannot be disputed is that there is a cost of ownership for any piece of software, i.e. the cost of using that software over a given period of time. You have to consider the initial licensing cost but also the amount of time it will take you or your development team to learn how to use the software (including what it is and is not capable of), integrate it into the system under development, extend the functionality if necessary, port it to specific platforms, support it, and so on. These so-called hidden costs are likely to vary widely based on the type of open source software. Read on…
The term “open source” covers a lot of ground. It ranges from “free as in free beer”, to dual-license software such as our own Perst that can be used under the GPL, or with a commercial license from McObject if you can’t accept the GPL terms. There are also possibilities in between.
One important factor in deciding whether to use any piece of open source software is the motivation of its developers/maintainers. After all, it takes time for the original author(s) to create it, support it, and enhance it over time. So let’s consider a few possibilities.
Linux is probably the best-known open source software of all time. It was created by Linus Torvalds in 1991 to satisfy his particular need for a terminal emulator to access UNIX servers at the University of Helsinki with his 80386-based PC. It bore no resemblance to the Linux of today (in fact, wasn’t even called Linux, yet), but it developed a following on Usenet that ultimately broke the chains that bound it to the x86 architecture. Today, thousands of users contribute to Linux (albeit mostly by suggesting improvements to the maintainers), and it’s available on dozens of architectures. The Linux community is pretty robust: if you need to port Linux to some new architecture, you can probably find support, even if you choose not to procure Linux through one of the commercial Linux vendors like Red Hat.
Linux is almost unique among open source software in that an entire economy has developed around it. There are two other categories of open source software that are far more prevalent.
First, there is software from hobbyists, created to fill an immediate need, or because it was fun/challenging, and released under an open source license. There are many such open source “products” and really no support for them. The author might be receptive to answering a question or two, but has likely moved on to the next interesting project and/or has a full-time job.
Another motivation for open source software is epitomized by McObject and our Perst database system for Java and .NET. Perst generates a good bit of exposure for McObject, so it helps build our brand and our reputation in embedded database management system software. Some of that exposure leads to crossover sales of our flagship product, eXtremeDB. Perst also generates exposure through its use in other open source and commercial software, such as Jease and JadBlack. And Perst generates revenue for McObject through technical support agreements and sales of commercial licenses. So Perst belongs to a category of open source software products that offer both paid support, and free support through the community. You can also support it yourself (after all, you have the source code).
Another twist on the dual license approach was adopted by MySQL when it was an independent entity and has, so far, been retained by Oracle. As with Perst, you can buy support, depend on the community, and/or support it yourself. But another component of the MySQL revenue model is advanced features available only with a commercial license, such as the MySQL Cluster CGE (“Carrier Grade Edition”) version.
At the end of the day, using open source software is going to cost you something. To begin to analyze what that cost might amount to over the period of time you’ll be using it, and whether or not that cost is more, less, or equal to the cost of a competing commercial product, ask (1) where that piece of open source software falls in the continuum of possible types of open source software and the motivations of the authors, and (2) how closely the open source software matches your required functionality and what options are available to extend the functionality.
In the first case, you might be able to build a compelling business case if the software is widely used, has a robust community supporting it, or 3rd parties have emerged to offer commercial support. For open source software contributed by, e.g., a hobbyist, that business case can get harder to make.
In the second case, even if an open source package has a well-developed community, if you have to modify the source for any reason (add functionality, port to a proprietary platform, etc.) then you’re going to be largely on your own for support (and the business case begins to resemble OSS contributed by a hobbyist, with no ongoing development or support).
Above all, view with skepticism the claim that any software is “free.” Initial licensing charges are often a fraction of overall cost of ownership. If you’re embedding that software in your own technology, you may be living with its total costs for a long time.
First of all, this is not going to be a rant about/against open source software. McObject offers open source software in our Perst object-oriented embedded database system for Java and .NET.
Rather, this is going to be a two-part post on the misconception that in-house (AKA homegrown, AKA roll-your-own) or open source software are lower cost alternatives to commercial software (which is usually, but not always, closed source).
In this first part, I’ll address the in-house development misconception. At McObject, we bump up against this mindset most frequently in two settings: low cost offshore development, and the developers who believe that a commercial database is “overkill”, i.e. their data management needs are modest/simple and therefore the rigor of a database management system is unnecessary. The latter case seems to be more prevalent in the embedded systems markets than in the enterprise markets, probably because embedded systems developers are not accustomed to thinking in terms of database management – though they’re being driven in that direction by the ever-increasing complexity of embedded systems.
In the first case (low cost offshore development), there is often a mistaken belief that because offshore labor is inexpensive compared to U.S. or western European labor, there’s no cost advantage to buying off-the-shelf software. However, if you consider the facts, the truth becomes evident:
-- the entry cost for a product like McObject’s eXtremeDB is as low as $4,000.00
-- several programmer-years went into the initial release of eXtremeDB in 2001, and tens of programmer-years since. Therefore, it’s impractical to think that equivalent functionality, or even a meaningful subset, can be created from scratch in the few programmer-months that would constitute ‘lower cost’ even at offshore compensation levels
-- it is not possible for de novo code to be as stable/reliable as code that has been in production and field-tested for 10 years
-- the prior point is a convenient segue to the observation that roll-your-own takes time to develop, and more time to test, whereas off-the-shelf software is usable immediately. In today’s fast-moving world, lost time equals lost opportunity
Okay, so we’ve established that roll-your-own is probably not cost-effective. But it gets worse. Let’s say you have a team of 5 developers. Even if just one of them is tied up developing a roll-your-own solution to some problem (it doesn’t have to be databases, after all), then 20% of the team is focused on work that
-- doesn’t differentiate your product from the competition’s, and
-- is not your organization’s core competency
As a businessperson, given that cost savings doesn’t justify it, how do you justify it?
The case for the embedded systems developer is similar, and easier to make if the developer is not in a low-wage locality. Even if the supposition that I posited earlier (a commercial database is “overkill”) is true, adopting a roll-your-own approach to implement the subset of functionality needed is cost-prohibitive.
Let’s do a little analysis of the hidden cost of roll-your-own.
A paper by Watts S. Humphrey, founder of the Software Engineering Institute and the Capability Maturity Model, examined the results of studies at IBM and TRW on the relative cost of correcting defects at various stages in the development cycle:
(The final data point of the TRW study was actually a range of 70 – 1000. I used the value 200 rather than the worst-case 1000 in order to keep the graph in a reasonable scale, and to be conservative.)
Humphrey also asserts that, historically, senior programmers insert 100 defects per 1000 lines of code (KLOC), half of which are found by the compiler. He finds that the time to find and correct each defect during the test stage ranges from 2 – 20 hours. Assuming a payroll cost of $100,000 for a senior programmer, then each defect found during test has a cost of $96 - $960, and, applying the relative cost figures from the IBM and TRW studies, $160 - $3,200 in the field.
If we use some representative figures for KLOC in a project of modest size, we can arrive at some concrete costs for roll-your-own. Let’s assume that the code size to implement the data management requirements of the project is 25,000 lines of code (not an unreasonable assumption given the approximately 60,000 lines of code in the core of eXtremeDB, i.e. excluding Fusion, High Availability, eXtremeSQL and the other advanced options). Using Watts Humphrey’s figures, there will be 2,500 defects, half of which will be found by the compiler, leaving 1,250 defects to be discovered and remediated no sooner than “during test”. This leads to a minimum cost of 1,250 X $96 = $120,000.00 if the average cost to fix is at the low end of the $96 - $960 range and the defects are all found “during test” and not later.
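The arithmetic above can be reproduced with a few lines of code, which also makes it easy to try other assumptions. This is a sketch only: the function name `remediation_cost` is illustrative, and the $48 hourly rate is an inference from the post's figures ($100,000 spread over roughly 2,080 working hours, rounded so that 2 hours costs $96), not a number stated in the text.

```python
def remediation_cost(lines_of_code,
                     defects_per_kloc=100,
                     compiler_catch_fraction=0.5,
                     hours_per_defect=2,
                     hourly_rate=48):  # assumed rate; see the note above
    """Return (defects left for testing, cost to fix them during test)."""
    total_defects = lines_of_code / 1000 * defects_per_kloc
    remaining = total_defects * (1 - compiler_catch_fraction)
    cost = remaining * hours_per_defect * hourly_rate
    return remaining, cost


# Low end of the 2-20 hour range, as used in the post.
defects, low_cost = remediation_cost(25_000)
# High end of the range, for comparison.
_, high_cost = remediation_cost(25_000, hours_per_defect=20)
print(f"{defects:.0f} defects, ${low_cost:,.0f} to ${high_cost:,.0f}")
```

Changing `lines_of_code` or `hourly_rate` shows how quickly the total moves with project size and labor cost.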
Conversely, already-debugged off-the-shelf software should inject no new defects into the project. Some might argue that using the API of an off-the-shelf software package also means writing a bunch of lines of code to use that API, and that this will also introduce bugs. But this misses the point that, regardless of whether the data management code is roll-your-own or off-the-shelf, application code still has to be written to use that data management code. So the argument is a non-starter: you’re going to write the application code, either way.
This cost of remediating defects, detailed above, illustrates the tremendous savings likely to accrue from using off-the-shelf software for any modestly complex and demanding function. Exactly where this tipping point lies depends on the variables in the cost and complexity equation (i.e. labor cost and volume of code that will need to be written to meet the functionality required). Generally speaking, it is probably best to stick to solving line-of-business problems, versus solving general computing problems that have already been solved, like operating systems, database systems, web servers, and the like.
Last week, McObject exhibited at the Seattle (August 24) and Vancouver B.C. (August 26) Real-Time & Embedded Computing Conferences (RTECCs). If you’re not familiar with these events, visit their website, where the About Us page says “This single-day event series is specially designed for people developing computer systems and time critical applications serving multiple industries, such as: military and aerospace, industrial control, data communication and telephony, instrumentation, consumer electronics, image processing, process control, medical instrumentation, vehicular control and maintenance, embedded appliances and more.”
Exhibiting companies at these shows are a broad mix of hardware and software vendors. Some of the exhibitors also offer 45-minute technical seminars within RTECC in which we pontificate about some aspect of technology, challenges, solutions, and/or trends (these are not supposed to be a product pitch, which vendors do a more-or-less good job of honoring). The RTECC organizers usually line up an industry heavyweight (or a panel) to deliver a lunchtime keynote.
RTECCs are held in cities across North America (> 20) and around the world (2 in Scandinavia and 5 in China) and offer a great way for embedded systems hardware and software engineers to learn, research and engage with the vendors without having to petition for travel, conference fees, a week out of the office, etc., with their employers. And RTECCs include a free lunch (at least in North America). Who says there’s no free lunch?!?
Okay, end of commercial for RTECC and on to the point of this post. We (McObject) have got a number of metrics we can look at to help us gauge business prospects. We can look at revenue over the last N months (pick a number) and see if it’s up or down from last year or last quarter, and whether it’s generally trending up or down (it’s trending up). We can look at the traffic on our website, the number of qualified leads in the pipeline, and so on. Being out there, mixing with the public, is decidedly less scientific (one of those touchy-feely things that we propeller heads are notoriously bad at), but also valuable. While I believe we can work our behinds off (and we do) and continue to grow, it’s nice to know if we’re bucking a headwind or enjoying a tailwind. If the wind is at my back, I’m more likely to stick my entrepreneurial neck out a little further and invest more aggressively in hiring additional staff, capital expenditures, and so on.
So after the Seattle and Vancouver events, here’s what I came away with. Seattle is a larger market than Vancouver, so more attendees pre-registered, and more actually showed up. But in my seminar, there were only about 10 – 12 folks sitting in. I customarily fill up the room, so that was a bit disappointing. The lower-than-expected headcount was mirrored in the exhibits area; we had relatively few meaningful conversations (that is, conversations with software engineers who wanted to learn about the eXtremeDB embedded database because they have a problem to solve). In Vancouver, I did have a full room for my presentation, and we had excellent conversations with folks at the exhibits. This was in spite of the fact that there were fewer attendees than in Seattle.
Some at the Vancouver event speculated that Canada is farther along in the recovery than the U.S., and/or that Canada’s economic woes were not as severe to begin with. That seems plausible. I can tell you that the Vancouver show was much better for McObject than it was in 2009. Seattle was about the same; just degrees of badness. The other vendors that I talked to there (sorry, I’m not naming names) were almost universally wishy-washy about the current business climate: not as pessimistic as before, but certainly not ebullient.
With all that said, as much as I think Seattle is the center of the universe (more specifically, Fremont), I don’t think we can extrapolate the health of the U.S. embedded systems market from this one show. Next month, we’ll be at the RTECCs in Austin and Dallas. It will be interesting to see how those shows go, generally and relative to last year.
I think what we need is a good dose of optimism and boldness. Pessimism and angst are our enemies. This is America, folks! We have the world’s largest economy. It’s okay to start that new project. It’s more than okay; it’s what we need to do. More companies having the courage to do that is what will bring us economic growth. Conversely, timidity and unwillingness to launch those projects will drag out stagnation, and that’s not good for any company, or the U.S. as a whole.
There has been some interesting recent news in our world of databases.
This morning, I picked up my August 15 issue of SD Times. In the lower right corner is a blurb about the 1.0 release of CouchDB, with the full article dominating page 3. Congratulations to the CouchDB team.
In a press release about CouchDB 1.0 that I read earlier in the week, I was surprised to see CouchDB team member Damien Katz refer to CouchDB as a "post relational" database system. Before this, I had only seen the term post relational used by Intersystems to refer to the underlying network model (aka CODASYL) database in their Caché product. (That always struck me as an odd phrase, anyway, because the network/CODASYL model pre-dates the relational model.)
That prompted me to look up "post relational" in Wikipedia. "Post relational" doesn't have its own page, but it's referenced on the "database" page (http://en.wikipedia.org/wiki/Database), where it begins "Products offering a more general data model than the relational model are sometimes classified as post-relational."
Just searching the internet for "post relational" with Google or Bing turns up (mostly) references to Caché, a couple references to Matisse, and of course the recent use of the term by Katz.
I think that term is as ill-advised as the term "NoSQL". Neither tells me anything. Both can (apparently, according to Wikipedia) mean anything from network model database systems overlaid with object orientation and SQL, to a document database system, a graph database, and so on. As an aside, some NoSQL advocates have tried to de-politicize the term NoSQL by saying that it really means "not only SQL". Right; that clears it up. And, now here comes the "YesSQL" crowd (or movement, if you want to give it instant credibility).
I think it's time to give up on ambiguous terms like "post relational" and "NoSQL". Just tell your potential customers what your technology is. We're in the information business, so give us the information - not trendy marketing terms that impart no information.
Another bit of news in the August 15 issue of SD Times covers Terracotta's release of Ehcache 2.2 (a Java caching component that Terracotta acquired last year), with a major new capability, apparently, in its ability to store more than a terabyte of data in memory. Congratulations, also, to the folks at Terracotta. Terabyte-scale in-memory storage holds great promise for a wide range of applications from social networking to data analytics to scientific research. eXtremeDB crossed the terabyte threshold in 2007. A key difference between Ehcache and eXtremeDB, however, is that eXtremeDB is written in C, and so runs at compiled code speed versus interpreted Java. Developers typically use C and C++ to create applications with our database, but through the eXtremeDB JNI, Java programmers can also take advantage of its speed and scalability while still working entirely within their programming language of choice.
The term 'embedded database' has been around since the mid-1980's. It was originally created to mean a database system that is embedded within application code. In other words, the database management system is delivered as a library that you, the developer, link with your application code (and other libraries) to create an executable. In that sense, the database system functionality is 'embedded' within your application code. Hence the name "embedded database."
Since the late 1990's, embedded database system vendors have been trying to sell their technology to developers of embedded systems. This has created a lot of (unfortunate) confusion. In the 10 or so years since, some folks have come to equate "embedded database" with "embedded systems", which has led them down a path to frustration and, in some cases, project failure.
Why? Because the vast majority of embedded databases were not written with the unique characteristics (slower CPUs, limited memory, no persistent storage, etc.) of embedded systems in mind. In fact, many embedded database systems were created in the 1980's, long before anyone considered using an embedded database in such systems (remember that most embedded systems in that era were 8- and 16-bit systems that simply couldn't address enough memory to permit use of a COTS embedded database system).
Unfortunately, some embedded database vendors haven't helped the situation. They have adjusted to changing market conditions by re-casting their embedded database products as a solution to the data management needs of embedded systems, even though their technology was not written – and, in fact, is not suited – for embedded systems. These changing market conditions include the rise of open source/dual-license products like MySQL and BerkeleyDB, which became dominant players in the line-of-business client/server DBMS market and the embedded database market, respectively, and the emergence of free entry-level RDBMS offerings from Oracle and Microsoft (Oracle 10g Express Edition and SQL Server Express Edition, respectively). Faced with these challenges, vendors of proprietary, closed source, and commercial (not free) embedded database products found it increasingly difficult to compete, and sought “green fields” in the embedded systems software market for their products.
As an aside, the media, and SD Times in particular, recognized the situation in the early part of the last decade and tried to popularize a new term, "application-specific database." Unfortunately, the effort didn't stick and we are still left with the term 'embedded database'.
So, back to the subject of this blog post. What is the essential attribute of an embedded database system? It is exactly what I described in the opening paragraph: The database system functionality is linked with application code and resides in the same address space. This contrasts with a client/server DBMS architecture, in which the database server exists as a standalone executable, accessed by client programs through an inter-process communication (IPC) and/or remote-procedure-call (RPC) mechanism.
In short, an embedded database system should exist wholly within your application's address space and not require communication with any external agent. Anything external is an immediate tip-off that the DBMS is not, in fact, wholly embedded.
As a former colleague of mine, a VP of marketing, once said to me: "What is the 'so, what' of it?" Excellent question. Why should anyone give a hoot?
Perhaps in the non-embedded systems market of embedded databases, nobody does (though even that is arguable). But in embedded and real-time systems, the "so, what?" is performance. The need to communicate with an external program, for any purpose, imposes a performance hit that few real-time/embedded systems can afford. This is true regardless of whether that external program is a lock manager, lock arbiter, deadlock detector, or anything else.
Another "so, what?" is the introduction of dependencies on external components, notably a communication protocol like TCP/IP. Communication between the application (with the database system embedded within it) and an external component also necessarily increases the complexity, fragility, and, consequently, the potential need for administration. These dependencies might not be a big deal in line-of-business systems running on PCs and other systems running robust operating systems like Windows, Linux and Solaris and in organizations with an IT staff. But for an unattended embedded system running on a relatively modest CPU, with a simple RTOS and limited network connectivity/bandwidth, it can be a killer.
Since I am writing this blog post, it should be no surprise that eXtremeDB is an embedded database in the true sense. eXtremeDB never requires communication with an external component. We do offer remote interfaces to eXtremeDB databases through both our native and SQL APIs, and the High Availability edition requires a communication channel for synchronizing master and replica databases, and replicating transactions. But these are optional.
If you have demanding performance requirements, limited resources, and/or are developing an embedded system that absolutely, positively must run unattended (i.e. "zero administration"), then carefully consider your choice of embedded database system.
Some of us here at McObject just read this article on embedded.com titled Making the case for commercial communication integration middleware.
A lot of the suppositions in that article with respect to the case for COTS operating systems and communications middleware also ring true for database systems. From the beginning of McObject, we've recognized that RYO represents our largest "competitor".
One of the arguments put forward by Dr. Krasner is that "RYO solutions tend to be designed and implemented based on initial connectivity requirements and thus are very brittle when new requirements are introduced." The same can be said of RYO data management solutions. You have a specific need and you write a solution for that specific need. When changes come later, the original solution needs some retrofitting, which can range from minor to a major overhaul.
Dr. Krasner goes on to say that "These [RYO] designs are tightly coupled...", which is true. In contrast, COTS solutions are designed from the start to solve a wide variety of database management problems, and are not designed to solve one specific problem. Consequently, they are loosely coupled.
After listing some limitations of RYO, Dr. Krasner says "Over time, as the above issues are addressed, RYO middleware ... often ends up becoming a full-blown infrastructure..." I can't tell you how many times I've seen that outcome in the context of RYO data management solutions, too. Inevitably, the organization that embraces RYO finds itself committing ever greater resources to a custom, in-house database management system until one day somebody realizes it and initiates a search for a COTS replacement so that the staff can return their attention to their own core competence.
The balance of the article makes the case for the return-on-investment (ROI) of COTS by putting some numbers to the metrics for successful projects involving communications middleware. I'd love to see a similar study for COTS vs RYO database systems. I'd bet dollars to donuts that the case would be as, or more, compelling for COTS database systems.
McObject’s customers are more likely to express the economic benefit from using our eXtremeDB embedded database in terms of developer-weeks or developer-months saved. For example, Boeing credited eXtremeDB with saving 18 developer-months in an upgrade to the embedded software in its Apache Longbow helicopter. Another customer, IP Trade, provides a communications system for securities traders. IP Trade’s head of development pointed to 6 programmer-months saved by using eXtremeDB rather than building data management code from scratch. Presumably those calculations include both development and QA – but ease of updates, code maintenance and other downstream benefits of using a COTS database will add to the savings.
eXtremeDB 4.0, which was released last month, includes a new, cleaner, interface for creating/opening databases, especially when a database is a hybrid (in-memory and on-disk) database.
Prior to version 4.0, two functions were needed to open a hybrid eXtremeDB database: mco_db_open() and mco_disk_open(). To provide a consistent interface irrespective of whether a database is entirely in memory, entirely on disk, or a hybrid, we introduced the concept of database 'devices' to eXtremeDB. In 4.0, the approach is to define an array of structures that describe the devices a database needs, and to pass that array in as an argument to the new mco_db_open_dev() interface. For backward compatibility, in-memory databases can still be opened with the legacy mco_db_open() API.
A 'device' can be a conventional memory segment, a shared memory segment, a simple file path (for the database and/or log file), a multi-file path, or a RAID-type device.
A multi-file path can be considered a virtual file consisting of multiple segments. When the first segment is full, we start filling the second one, and so on. For file systems with 2GB size limits, a multi-file device allows for on-disk databases >2GB.
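To make the "virtual file" idea concrete, here is a small sketch of how a logical position could be mapped onto segments. This is not the eXtremeDB implementation or API; it assumes equally sized segments and plain byte offsets, and the function name is invented for illustration.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def locate_in_multifile(offset, segment_size):
    """Map a logical byte offset to (segment index, offset inside that segment).

    Assumes every segment has the same size, which is a simplification
    made for this sketch.
    """
    if offset < 0 or segment_size <= 0:
        raise ValueError("offset must be >= 0 and segment_size must be > 0")
    return divmod(offset, segment_size)


# A logical offset of 5 GiB, with 2 GiB segments, falls in the third
# segment (index 2), 1 GiB past its start.
segment, inner = locate_in_multifile(5 * GIB, 2 * GIB)
print(segment, inner // GIB)
```

No single segment ever grows past the file system's limit, yet the logical database can be as large as the number of segments allows.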
RAID-style devices offer two additional capabilities: in a RAID-0 configuration (striping), database pages are scattered between RAID segments. It is assumed that each RAID segment resides on a physically separate device so that writing to two segments in separate devices can proceed in parallel. Obviously, this also requires hardware support (i.e. that there are separate controllers and I/O channels such that the read/write operations are not serialized through a single controller/channel).
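As a rough illustration of striping, the sketch below places pages on segments round-robin. The round-robin rule is an assumption made for the example (the actual page placement policy may differ), and the function name is hypothetical.

```python
def stripe_location(page_number, segment_count):
    """Return (segment index, page slot within that segment) for a page.

    Round-robin placement: consecutive pages land on consecutive segments,
    so writes to neighboring pages can go to different physical devices.
    """
    if page_number < 0 or segment_count <= 0:
        raise ValueError("page_number must be >= 0 and segment_count must be > 0")
    return page_number % segment_count, page_number // segment_count


# Pages 0..5 spread over two segments alternate between them.
for page in range(6):
    print(page, stripe_location(page, 2))
```

With two segments on separate devices, pages 0 and 1 can be written at the same time, which is where the parallelism comes from.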
A RAID-1 configuration (mirroring) is also supported. In this configuration, the database pages are written simultaneously to each device. This can improve reliability: after a disk crash, there is no need to restore from a previous backup, and no risk of losing data because the roll-forward transaction log happened to be on the same device as the database file(s).
In summary, devices in eXtremeDB 4.0 provide a more elegant programming interface. They also made it possible to extend eXtremeDB to support on-disk databases >2GB even if the underlying file system has a 2GB file size limit, and to support striping and mirroring of databases, which, together with the 4.0 MVCC transaction manager, has the potential to further exploit multi-core and parallel programming.
On Monday, we announced the release of eXtremeDB 4.0. Evaluation versions of the eXtremeDB In-Memory Database System and eXtremeDB Fusion (hybrid in-memory and on-disk) database system, with and without eXtremeSQL, for 32-bit Windows and Linux are available for download now.
Two central themes summarize the 4.0 release: Leveraging multi-core and expanding the already-expansive choices of APIs, index types, and more. eXtremeDB 4.0 includes new choices for the transaction manager, a new index type, a new programming interface choice, and other improvements to maximize multi-core and make it easier to work with hybrid databases.
I'll discuss the new transaction manager today, and other major features of 4.0 in the days to come.
eXtremeDB 4.0 introduces a new transaction manager to the product: the MVCC transaction manager. MVCC is an acronym for Multi-Version Concurrency Control. This is a concurrency control technique often found in our big cousins (Oracle, et al) but not in database systems for embedded systems. Until now, that is. Previous versions of eXtremeDB employed a Multiple Reader Single Writer transaction manager (we call it MURSIW and pronounce it "mer siv"). This transaction manager was, and is, fantastic for an in-memory database system for embedded systems with relatively few concurrent tasks/threads. In such a setting, the cost of complex lock arbitration is unjustifiable. Over the years, though, eXtremeDB has expanded beyond our initial target market, and has acquired new functionality (like the hybrid capability where some or all of the database is stored on persistent media). The MURSIW transaction manager doesn't always fit in these environments; there may be more concurrent threads updating the database than MURSIW can efficiently handle, or the storage media is too slow.
But, we didn't want to change the programming paradigm of eXtremeDB by introducing a lock arbiter and pessimistic locking APIs. And, given the accelerating adoption rate of multi-core systems in embedded systems, we also didn't want to implement a concurrency model that would create barriers to maximum utilization of multiple cores. So MVCC was a natural choice.
MVCC is an optimistic concurrency model. No task or thread is ever blocked by another because each is given its own copy of objects in the database to work with during a transaction. When a transaction is committed, its copies of the objects it modified are put back to the database. So no explicit locks are ever required during a transaction, and therefore there is no lock arbiter. Locks are implicitly applied by the eXtremeDB run-time when the transaction commits. It is possible that two tasks will try to modify the same object at the same point in time. In this case, one task will receive an error code, MCO_E_CONFLICT. So application logic needs to account for this possibility and be prepared to re-try the transaction. Apart from this, your eXtremeDB application code doesn't change - you still wrap your database access code in between mco_trans_start() and mco_trans_commit() calls.
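The retry pattern can be shown with a toy model. To be clear, this is not eXtremeDB code and none of these names exist in our API: ToyStore, ToyTransaction, ConflictError and run_with_retry are invented for this sketch, the conflict check only validates objects the transaction has read, and the whole thing is single-threaded so that the logic stays visible.

```python
class ConflictError(Exception):
    """Stands in for an MCO_E_CONFLICT-style result (hypothetical name)."""


class ToyStore:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    def __init__(self, store):
        self.store = store
        self.seen = {}     # key -> version observed when first read
        self.pending = {}  # key -> private copy of the new value

    def read(self, key, default=None):
        if key in self.pending:
            return self.pending[key]
        version, value = self.store.data.get(key, (0, default))
        self.seen.setdefault(key, version)
        return value

    def write(self, key, value):
        self.pending[key] = value  # nothing is shared until commit

    def commit(self):
        # Fail if anything we read was changed by another transaction.
        for key, version in self.seen.items():
            if self.store.data.get(key, (0, None))[0] != version:
                raise ConflictError(key)
        for key, value in self.pending.items():
            current = self.store.data.get(key, (0, None))[0]
            self.store.data[key] = (current + 1, value)


def run_with_retry(store, work, max_attempts=5):
    """Run work(txn) in a fresh transaction, retrying on conflict.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        txn = store.begin()
        work(txn)
        try:
            txn.commit()
            return attempt
        except ConflictError:
            continue  # discard the private copies and start over
    raise ConflictError(f"gave up after {max_attempts} attempts")
```

The point to notice is that the work function is re-run from scratch on every attempt, so it must re-read whatever it needs rather than reuse values from the failed attempt.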
As I alluded to above, the MURSIW transaction manager is still part of eXtremeDB, so you can choose between MURSIW and MVCC. They are delivered as separate libraries, so you make the choice at compile-time. Which leads to the obvious questions: How do I choose, and what are the tradeoffs?
The characteristics that I described above that were the original design goals for MURSIW still make MURSIW the better choice when those characteristics are present: an in-memory database with a small number of concurrent tasks modifying the database. There is very little overhead with the MURSIW transaction manager. Read-only tasks can operate in parallel since they don't modify the database and therefore cannot interfere with each other. Read/write tasks will have exclusive use of the database for the duration of their transactions, but an eXtremeDB in-memory database is so fast that the transaction often completes faster than it would have been possible to perform a context switch to a lock arbiter, much less actually arbitrate access requests.
If there are more than a few concurrent tasks that need to modify the database, or you cannot tolerate having read-only requests blocked by a task that is modifying the database, or the storage media is anything other than RAM (i.e. a hybrid database on relatively slow HDD or SSD media) such that transactions run too long for MURSIW to make sense, then MVCC might be the better choice. MVCC carries more overhead because it has to create the copies of the objects for each task (read-only or read-write), track them, write them back in the case of a read-write transaction, and eventually discard unused objects. So if you compare MURSIW and MVCC with a single thread, for example, MURSIW will win every time because of the lower overhead. Likewise, if you compare MURSIW and MVCC when the access is largely read-only, for any number of concurrent threads, MURSIW will win. However, for concurrent write transactions, the additional overhead is quickly overcome on multi-core systems. Here are some pictures to illustrate the point.
The tests were executed on a quad-core system running Windows Vista. Each thread executed 1 million inserts and hash searches. The graphs show the time in milliseconds for the threads to complete the tasks. Smaller numbers are better. As you can see, with two and four concurrent threads writing to the database, the MVCC transaction manager overcomes the additional overhead, and with four threads attained a total throughput of over 1,600,000 inserts per second, and 4,000,000 searches on a hash index. This is a simple test for which the sole purpose is to illustrate the differences between MURSIW and MVCC.
Another characteristic of the MVCC transaction manager is a higher memory requirement because it creates copies of objects for each concurrent task. We don't see this as much of an issue, however. eXtremeDB has always been exceptionally frugal with memory, so relative to alternatives, we had room to work with. And it is expected that the systems for which MVCC will be the logical choice are not typically resource-constrained.
Have questions? Leave me a comment (you need to be registered).
Recently, I tweeted (see http://twitter.com/McGuy) a comment on the claim in this article http://www.javaworld.com/community/?q=node/3567 that "SSD performance is limited only by the SATA2 interface throughput."
Somebody replied that 250MB per second sustained read speed = 2Gb/s. Add on the bus overhead and you have 3Gb/s (the SATA2 speed limit).
This page http://www.anandtech.com/storage/showdoc.a...i=3531&p=24 has test results that bear out the 250MB per second claim, but only just, and only in one specific test case - a sequential read of 2MB files.
Every other test conducted shows performance well south of 250MB/s. Sequential writes were 2nd best at 195 MB/s. The much more realistic random reads and random writes were 56.5 and 31.7 MB/s, respectively.
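To put those numbers side by side, here is a quick calculation. It is only a sketch: the function names are mine, decimal units are assumed (1 Gb = 1,000 Mb), and the 250MB/s peak is the best-case sequential read figure discussed above.

```python
def megabytes_to_gigabits(mb_per_s):
    """Convert MB/s to Gb/s using decimal units (8 bits per byte)."""
    return mb_per_s * 8 / 1000


def percent_of_peak(mb_per_s, peak_mb_per_s=250):
    """Express a measured rate as a percentage of the best-case rate."""
    return round(100 * mb_per_s / peak_mb_per_s, 1)


print(megabytes_to_gigabits(250), "Gb/s at the best-case sequential read rate")
for label, rate in [("sequential write", 195),
                    ("random read", 56.5),
                    ("random write", 31.7)]:
    print(f"{label}: {rate} MB/s is {percent_of_peak(rate)}% of 250 MB/s")
```

Even the top-performing drive's random I/O comes in at well under a quarter of the best-case figure.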
Further, these numbers were only attained by the cream of the crop. The much more 'pedestrian' results were under 30MB/s read and under 3MB/s write!!!
There's also no mention in the article of whether the tester conducted the tests with "fresh out of the box" SSDs or preconditioned the drives before conducting the tests. This can have a huge impact on real-life performance. See http://www.flashmemorysummit.com/English/C...torial_Amer.pdf for example.
So, I stand by my comment. SSDs, in real life, are nowhere near pushing the SATA2 speed limit, and comments to the contrary are just marketing hyperbole.