Friday, April 12, 2013

Open Source ETL Considerations

The Business Intelligence and Data Warehousing practice director at my former employer Xtivia posted this article and I wanted to repost it here on my blog.
_______________________________________
Lately, we at Xtivia have seen a marked uptick in interest in open source ETL tools. Each customer situation is unique, but I would like to propose a few factors that you should review before making the decision:


    Incumbent Tool Set:  Decisions around your choice of an ETL tool are affected by a lot of different things. End of the year or introductory pricing always looks attractive and purchasing decisions are made in the "heat of the moment". This works out okay for a little while but then the costs start adding up. The key is to focus on "how" the tools are being used as opposed to "which" tools.

It is a big issue if you are not leveraging the tool to the fullest extent and your developers then turn to the tried and tested method of creating custom scripts/code to facilitate ETL. This further diminishes the value you can extract out of your investment.

    Changing Conditions: Change is the only constant, so why would the ETL landscape be any different? Another key factor is to check whether the tool set you are considering works with current and future data sources/destinations without adding to the costs. In other words, would you need to buy a license for a Sybase IQ connector if you decide to use that database for data mart purposes?

    Support and Community Activity: Open source based tools are just that--open source based. The key is to find the right vendor who can provide enterprise grade support and the right partner who can help with best practices and maybe even provide the first line of support. Also, the community that is contributing should be thriving--in other words, your best chance lies with a community that is seeing a lot of activity.

    Reliability and Functional Diversity: Key questions to answer are--Once you create a job using the ETL tool, does it function the way you designed it? Is there a "platform" to sustain ongoing collaborative development? How easy is it to tune the jobs for performance?

It may seem overwhelming, but this is precisely what we do at Xtivia. Our tool of choice is Talend. We have worked with many other ETL product suites like Informatica and InfoSphere DataStage as well. So, if you are looking to find the right partner, look no further--we can help with all aspects of data integration, and specifically with using the "best fit" tool for your organization.

Quick Tip on using Talend ETL

One of our consultants posted a short tip about using Talend's ETL tool, and I wanted to share it here.  Email me if you want to hop on the phone with the author and me.

http://blogs.xtivia.com/home/-/blogs/looping-construct-for-etl-simplified-by-talend--1?_33_redirect=null


Wednesday, October 24, 2012

Good Article on Hadoop and Big Data

There's a lot of hype about Big Data, and Hadoop usually comes up as the framework being leveraged to "solve" Big Data issues.

I found a very good, commonsense article about Hadoop that really nails some of the key discussion points.  For example, it compares relational databases and the traditional data warehouse approach with what Hadoop proposes to do.  It makes a strong point about how Hadoop addresses the matter of managing unstructured data.  It also ends with some comments about how ETL vendors such as Informatica and the open source based Pentaho and Talend are adjusting their products to work with Hadoop.

I will continue aggregating and commenting on some of the better articles in this space.  Contact me if you'd like to brainstorm with our data management experts at Xtivia.

Friday, October 19, 2012

Open Source Data Management Tools

After some time off from my little open source software blog, I'm back.  My involvement in open source goes back to about 2006, and for a majority of the last six years I've been interested in mainly application development components such as JBoss, Spring, Hibernate, etc. and CMS/ECM platforms such as Liferay, Drupal, and Alfresco.

However, having some data quality, MDM, ETL, BI, and data warehousing experience dating back to the late '90s and early 2000s, I have spent some time this year catching up with open source BI and data warehousing tools.

I've known about Pentaho, JasperSoft, BIRT (Actuate), and Talend over the last six years, but there's a surprising lack of good, current online content that you can search for.  I plan on posting the good information and web links that I find here going forward.

Xtivia has been in the data management & data warehousing space for about 20 years and has traditionally used technology from Microsoft, Oracle, IBM, Informatica, MicroStrategy, and the firms they have acquired, but we feel the open source alternatives have matured and should be part of the product portfolio used by our clients.

I'm looking forward to passing along new relevant information for those who find my site!  Email me and I'll set up a call with our DW practice director to brainstorm on this topic!

Monday, February 13, 2012

10 Reasons to Choose Liferay Portal

My colleague recently wrote a great post on Xtivia's blog site that is definitely worth reading if you are exploring Java-based portal solutions.  Email me directly if you have any questions or would like me to set up a call with the author!  Xtivia is a Liferay Platinum Partner and we do a variety of Liferay services such as assessments, POCs, development & co-development, full implementations, etc.

Tuesday, February 7, 2012

Alfresco Cost Savings Report Available

People are often well aware of the cost savings of using commercially-supported open source software instead of proprietary software, but this study that Forrester Research did on Alfresco ECM at some large organizations shows that there are additional, more subtle reasons why Alfresco makes sense.

http://www.alfresco.com/resources/research/forrester/total-economic-impact-of-alfresco/

This is a good summary report that will provide you some solid facts if you are evaluating Alfresco.  Xtivia can take that even further with architectural assessment consulting and proof-of-concept engagements--our Alfresco and enterprise Java expertise is available to help organizations ensure that Alfresco ECM is a great fit.  We are U.S.-based and do all of our work onshore with high quality consultants.

Thursday, August 11, 2011

Big Companies Opting For More Open Source Software Alternatives

One of my old customers is a big player in the food industry, and we did Liferay consulting and architecture work for them this summer.  Liferay Portal represents a platform for them to drive more revenue from their existing customers and serve them better--simply put, the technology is being leveraged as a strategic business weapon.

Besides Liferay, they are also taking advantage of Alfresco for enterprise document management internally.  What's next?  They are exploring other mature open source software alternatives to cut down enterprise software licensing costs from the big ISVs in areas such as CMS, ESB, messaging, BPM, etc.  Since we live and breathe open source software, we are working with them to infuse new ideas and approaches to save them money and provide more freedom over their software options.

It's very refreshing to hear about how a large organization in a mature industry with thin margins is reaping the benefits of open source solutions, and with our Liferay services, Alfresco consulting, Drupal (CMS) expertise, Fuse and Mule ESB knowledge, etc. we have a great fit for mutual success!