MongoDB–A NoSQL Solution Introduction For Sage ERP X3 --- Say What?


One of the beautiful things about working on Sage ERP X3 is that the technology is always improving. I simply can’t wait for version 7 to come out. There will be an amazing new set of technological gadgets to play with. Our client will be built on HTML5. Elasticsearch is a new search mechanism we’ll deliver. We’ll also have a new way of describing the screens with presentation layers that allow us to enter the mobility world. We’ve got so much to offer, so for this post I just want to talk about one specific thing: MongoDB. Oh, and one huge caveat: everything is subject to change, as this is pre-release stuff we’re talking about here!

What is MongoDB?

If you use Google’s definition, a mongo is literally “a monetary unit of Mongolia, equal to one hundredth of a tugrik.” The Webster dictionary says the same thing, elaborating that the first known use was in 1935. This has absolutely NOTHING to do with how MongoDB defines itself. MongoDB.org says “MongoDB (from ‘humongous’) is an open-source document database, and the leading NoSQL database.”

(On a side note, I have no idea whether it is capitalized either, as the MongoDB website brands itself both ways, once with a capital M and once with a lowercase m.)

What is NoSQL?

See the article on MongoDB’s website titled What is NoSQL for a definition. In my own terms, NoSQL is a movement in computer science that seeks to improve efficiency in data store access and provide scalability (usually horizontal). Implementations of NoSQL like MongoDB add other features such as sharding, replication and more; we’ll get into that later.

One of the qualities of NoSQL is that you sacrifice the rules of normalization to improve query speed by avoiding costly joins on your system. Along with denormalized data structures (the counterpart of tables in SQL Server), you can have nested data structures. Think of it as a table within a table or, otherwise stated, a collection within a collection. So, instead of querying your main document type, such as a sales order, and then taking the time to query your sales order lines table and join the two by key, you stuff the lines into your primary data structure as a nested structure. This saves you from reading two tables and then using some type of join algorithm (nested loops, hash or merge joins are the traditional techniques Microsoft uses in SQL Server, as an example) to return the data to the caller as a single data set.

Another interesting quality is a dynamic schema. Because the data structures are dynamic, you can add a field to a structure on the fly by simply stating the field in your write statement. They are all just JSON (JavaScript Object Notation) structures. In Microsoft SQL Server, in order to add a field to your table you must run an ALTER TABLE statement, and there can be consequences depending on how people have written their applications against that table, so you must be careful.
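To make the nesting and the on-the-fly fields a bit more concrete, here is a minimal Python sketch using plain dictionaries rather than a real database. The order data, the field names and the `join_order` helper are all made up for illustration; nothing here reflects the actual Sage ERP X3 schema.

```python
import json

# Relational style: two separate "tables" related by a key (hypothetical data).
sales_orders = [{"order_id": 1001, "customer": "ACME"}]
sales_order_lines = [
    {"order_id": 1001, "line": 1, "item": "WIDGET", "qty": 5},
    {"order_id": 1001, "line": 2, "item": "GADGET", "qty": 2},
]


def join_order(order_id):
    """Read both tables and combine them by key, as a SQL join would."""
    header = next(o for o in sales_orders if o["order_id"] == order_id)
    lines = [
        {k: v for k, v in row.items() if k != "order_id"}
        for row in sales_order_lines
        if row["order_id"] == order_id
    ]
    return {**header, "lines": lines}


# Document style: the lines are nested inside the order, so one read returns
# everything and no join is needed.
order_document = {
    "order_id": 1001,
    "customer": "ACME",
    "lines": [
        {"line": 1, "item": "WIDGET", "qty": 5},
        {"line": 2, "item": "GADGET", "qty": 2},
    ],
}

# Dynamic schema: a new field simply appears in the structure being written.
# There is no ALTER TABLE step; "delivery_note" is a hypothetical field.
order_with_note = dict(order_document, delivery_note="Leave at dock 4")

print(json.dumps(order_with_note, indent=2))
```

The joined result and the nested document end up holding the same information; the difference is how much work it takes to get there at read time.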

All in all, there are some very healthy tradeoffs (and yes, it pains me to say that, being a staunch supporter of SQL Server) in the NoSQL movement. Evidence of this is all around you. Many of the social media sites are employing NoSQL solutions to handle the load they face (no pun intended, Facebook). I have to admit, I’ve been a huge fan of the Mastering Data Modeling book written by John Carlis and Joseph Maguire for a long time now… but if you asked them, I think even they would have to acknowledge the NoSQL movement as a valid programming and data storage approach of the future.

Don’t We Use SQL Server For Our Data Store On Sage ERP X3?

Don’t be fooled, only part of Sage ERP X3 will be using MongoDB. We will use MongoDB for our new dashboard and one very big feature, SQL Requester cached data stores. SQL Server and Oracle are still embraced for transactional data storage.

Wait, You Said Requesters… I Know Where You Are Going With This: SQL Log Bloat!

Yes, you guessed it (or did you?). Today, we know that people want to consume data and transform it into information, and we provide the requester toolset for that inside Sage ERP X3 (just one of the ways we can consume information). The tradeoff is that we write the data to the ALISTER table, and we do so inside a loop. So, if your requester returns a large number of rows, that translates to one insert statement to ALISTER for every row returned, all of which are logged. This creates some serious hot spots in your SQL Server log file and causes log bloat. Going forward, instead of caching the data in ALISTER, we’ll build JSON structures and store them in MongoDB. This should result in a serious improvement in terms of reducing log bloat, reducing the overall disk burden and improving speed, not only in requesters but across the entire application. Right now I’m told that our improvements in this area are substantial, but I haven’t tested it myself yet, so I’ll let you know when I see it. I expect amazing things, and I’m prepared to write epic poems and sing glorious songs to the developers who worked on this. But first, I do need to see it for myself.
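The difference between the two caching patterns can be sketched in a few lines of Python. This is only an illustration under assumptions: the row layout, the `REQ-0001` identifier and both helper functions are hypothetical, and nothing here talks to SQL Server or MongoDB. It simply counts how many write operations each approach would issue.

```python
import json

# Hypothetical requester result: 10,000 rows.
rows = [
    {"row": i, "customer": f"C{i % 50:03d}", "amount": i * 1.5}
    for i in range(10_000)
]


def cache_row_by_row(result_rows):
    """Looped pattern: one INSERT statement (and one log record) per row."""
    statements = []
    for r in result_rows:
        statements.append(("INSERT", r))
    return statements


def cache_as_document(request_id, result_rows):
    """Document pattern: one write carrying the whole result set, nested."""
    return [("INSERT", {"request_id": request_id, "rows": result_rows})]


looped = cache_row_by_row(rows)
single = cache_as_document("REQ-0001", rows)
print(len(looped), "writes versus", len(single))

# The nested structure serializes to a single JSON payload.
payload = json.dumps(single[0][1])
```

One practical caveat: a single MongoDB document is capped at 16 MB, so a very large result set would have to be split across several documents. How version 7 actually lays out its cached results is not something this post can confirm.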

What Other Advantages Are There To Using MongoDB Specifically?

MongoDB has quite a few features, but I’m going to let you check them out on your own. What I want to draw your attention to is sharding and replica sets.

Sharding

Sharding “is a method for storing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.” Essentially, you can distribute a collection (think of a collection like a table in SQL) across many instances of MongoDB. It is a technique of horizontal scaling. Where vertical scaling is the technique of building a faster, stronger machine, horizontal scaling seeks to use commodity PCs to distribute the workload. So, essentially, you can parallelize your queries against a collection pretty easily, meaning your query performance can skyrocket with horizontal scaling. Think about it: how can Google, Amazon, or Facebook achieve such a high number of transactions per second? They aren’t just buying bigger Storage Area Network devices. Many of these companies are using commodity servers and creating shards of their collections to achieve this.
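Here is a toy Python sketch of the idea, assuming a hashed shard key. It is not how MongoDB implements sharding internally (a real cluster routes requests through its own query router and supports both ranged and hashed shard keys); the shard count, the `shard_for` helper and the order data are invented for illustration. Documents are spread over three in-memory "shards", and a query is then run against every shard in parallel and merged.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 3  # assumption: three commodity machines


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a shard key to a shard number using a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Distribute one logical collection across the shards.
shards = [[] for _ in range(NUM_SHARDS)]
for order_id in range(1, 1001):
    doc = {"order_id": order_id, "amount": order_id % 7}
    shards[shard_for(order_id)].append(doc)


def total_on_shard(shard):
    """The piece of the query that each shard can answer on its own."""
    return sum(doc["amount"] for doc in shard)


# Scatter the query to every shard in parallel, then gather and merge.
with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
    partial_totals = list(pool.map(total_on_shard, shards))

grand_total = sum(partial_totals)
print([len(s) for s in shards], grand_total)
```

Each shard only scans its own slice of the data, which is where the speedup comes from once the shards live on separate machines.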

Don’t be concerned: Sage ERP X3 isn’t asking you to shard any of your data. This is just a feature of MongoDB. If you wanted, I suppose you could shard your requester result set data. It might benefit your performance quite a bit, but it would require some training and administration costs, something Sage ERP X3 doesn’t require out of the box or ever ask you to do. Again, we don’t have version 7 out the door, so I haven’t tested sharding requester result sets yet, but it is certainly something to consider if you want to geek out, or have a serious reason to do so.

Replica Sets

MongoDB can also maintain replica sets. Think of replica sets like database mirroring inside SQL Server. A great introduction is here.
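For a rough feel of what a replica set does, here is a deliberately simplified Python sketch: one primary takes the writes, records them in an operation log, and the other members replay that log so they hold the same data. The `Member` and `ReplicaSet` classes and the node names are hypothetical teaching props, not MongoDB APIs, and real elections and replication are far more involved than the `fail_over` shortcut shown here.

```python
class Member:
    """One member of a toy replica set (hypothetical, not a MongoDB API)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.applied = 0  # how many operation log entries this member has applied

    def apply(self, entry):
        key, value = entry
        self.data[key] = value
        self.applied += 1


class ReplicaSet:
    def __init__(self, names):
        self.members = [Member(n) for n in names]
        self.primary = self.members[0]
        self.oplog = []  # ordered log of every write

    def write(self, key, value):
        """Writes go to the primary and are recorded in the operation log."""
        entry = (key, value)
        self.oplog.append(entry)
        self.primary.apply(entry)

    def replicate(self):
        """Each member replays the log entries it has not applied yet."""
        for member in self.members:
            for entry in self.oplog[member.applied:]:
                member.apply(entry)

    def fail_over(self):
        """Simplified failover: drop the primary, promote the most caught-up member."""
        survivors = [m for m in self.members if m is not self.primary]
        self.members = survivors
        self.primary = max(survivors, key=lambda m: m.applied)


rs = ReplicaSet(["node-a", "node-b", "node-c"])
rs.write("order:1001", "open")
rs.write("order:1001", "shipped")
rs.replicate()
rs.fail_over()
print(rs.primary.name, rs.primary.data)
```

After the simulated failure of the first node, one of the remaining members takes over with the data intact, which is the same promise database mirroring makes.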

This Seems Very Cool, Where Can I Get MongoDB Training?

[Image: MongoDB University completion certificate]

That’s my official completion certificate from MongoDB University. As it turns out, you can get trained for free on MongoDB technology. That certificate represents seven weeks of YouTube classes, homework and a final exam. You just need to sign up for the MongoDB for DBAs class by going to MongoDB University’s website here. Right now, they offer:

  • M101J: MongoDB for Java Developers
  • M101JS: MongoDB for Node.js Developers
  • M101P: MongoDB for Developers
  • M102: MongoDB for DBAs

Conclusion

MongoDB is going to allow Sage to do some exciting things and to improve speed and flexibility for the Sage ERP X3 product line. It can also allow you to do some interesting things from an administration perspective, if you know what you are doing. But the really nice thing is that, unless you really want to know about MongoDB, you won’t ever have to deal with it directly. Just as with SQL Server and Oracle, Sage ERP X3 handles all the CRUD operations, whether they go to SQL Server, Oracle, MongoDB or even flat files. We’ve got you covered!