Why Greenplum?



A quick intro to the world of data warehousing (DW) and Greenplum (GP)...

In today's world of the Internet and rapid technological development, data volumes are huge, and for us to get information in an instant, the way Google delivers it, there have to be massive data centres with huge processing power and large amounts of data storage.

Storage has evolved over the past few years, but what is missing is technology to process the data faster as storage grows. With what we have, we could go for a single high-end server with lots of memory, but that is expensive, and a single unit limits how much storage and processing power we can have. So we need a way to process data faster with what is affordable for a small enterprise. That is where databases like Greenplum, Teradata, Vertica, Netezza, etc. come in.

Why do I say so? Because Greenplum Database works on an MPP, shared-nothing architecture. MPP stands for Massively Parallel Processing. Greenplum takes advantage of commodity hardware and uses parallelism to execute queries faster, giving a better data warehousing solution at a competitive price. Shared nothing means that each server holding data holds its own distinct portion of the data, with no overlap with the others. I will get back to that once we get into the internals.
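To make the shared-nothing idea a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how Greenplum is implemented: the segment count, the function names, and the use of CRC32 as the hash are all made up for the example, and threads stand in for separate servers. Each row lands on exactly one "segment", every segment sums only its own rows, and a coordinator adds up the partial results.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_SEGMENTS = 4  # hypothetical number of segment servers


def segment_for(key: str, num_segments: int = NUM_SEGMENTS) -> int:
    """Pick the one segment that owns a row by hashing its distribution key.

    CRC32 is just a stand-in hash for this sketch.
    """
    return zlib.crc32(key.encode("utf-8")) % num_segments


def distribute(rows, num_segments: int = NUM_SEGMENTS):
    """Place each (key, amount) row on exactly one segment (shared nothing)."""
    segments = [[] for _ in range(num_segments)]
    for key, amount in rows:
        segments[segment_for(key, num_segments)].append((key, amount))
    return segments


def local_sum(segment_rows):
    """Work done independently on one segment: sum only the rows it holds."""
    return sum(amount for _, amount in segment_rows)


def parallel_total(segments):
    """Coordinator step: run the local sums concurrently, then combine them."""
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        partials = list(pool.map(local_sum, segments))
    return sum(partials)


# Toy data: 100 rows keyed by a made-up customer id.
rows = [(f"customer-{i}", i) for i in range(1, 101)]
segments = distribute(rows)
total = parallel_total(segments)
print(total)
```

The point of the sketch is that no segment ever needs to look at another segment's rows to do its share of the work, which is what lets the work be spread over many inexpensive machines.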


So why Greenplum and not the others? Greenplum has a very stable product line, lots of customizations, and great support in the market. It has made itself enterprise ready and has also open-sourced its code so that a lot more people can reach out for it.

Another great advantage of Greenplum is that it is based on Postgres 8.2.15 and has made its own advancements since. So it is already familiar to a large base of database users, and it is relatively easy for administrators and users to learn and work with. Currently Greenplum is working on upgrading the database to line up with the latest standards in PostgreSQL.

Other advantages: it is known for fast data loading, for its fast Pivotal Query Optimizer, which is built to generate efficient query plans for complex queries, and for polymorphic storage, different compression options for tables, high availability, fault tolerance, fast backups and restores, comprehensive SQL support, and linear scalability.

I will write more on the architecture and other topics as this progresses!! -VT
