Please share with the community what you think needs improvement with IBM InfoSphere DataStage.
What are its weaknesses? What would you like to see changed in a future version?
From a practical point of view, solutions such as IBM InfoSphere DataStage and Oracle Data Integrator are losing ground, whereas open-source solutions are becoming increasingly powerful. For example, we are currently working hard on several such examples, and within a few years open-source solutions will take the lead in the market. They will be used by large enterprises. Clients are looking for open-source solutions more and more. It would be useful to provide support for Python, R, and Java.
The solution currently lacks virtualization capability. If they were to include it, it could be a good evolution for this framework. I'd like to see an improvement in support and a more customer-friendly and knowledgeable support staff.
The initial setup could be more straightforward.
The interface needs improvement. The interface in Informatica is easier than in DataStage. The licensing can be improved. Many companies are moving away from DataStage because it is expensive. The biggest unresolved issue is how DataStage artifacts are supposed to integrate into DevOps when they are binary files. We would like to see DataStage integrated with DevOps so that a pipeline can be created for auto-deployment. Right now we are all doing it manually.
The response time from support is slow and needs to be improved.
The product is pretty complex to set up. I think it is quite expensive. So, the setup could be simplified and the price could be brought in line.
I think that performance monitoring could be improved. I know that my colleagues don't do good monitoring. I'm not sure if it's because of the product or because they don't do it normally, but performance monitoring is an issue. I also believe integration with the cloud is not so clear. It's typically a heavy system that people install on-premise. You can install it in the cloud, but it's not so straightforward. You don't find a lot of information unless you go to the IBM Cloud. I think IBM is behind in cloud strategy. We would like to put it in the cloud, but there isn't much information about that. There are three things that could improve: the cloud, monitoring, and cloud integration. It's a solid product but not a modern one, and of course it depends on what you're looking for.
The mod options should be simplified. Some options on DataStage aren't working properly. The solution needs to lower its price. The template mapping could be easier. The solution should allow for compression of data.
The interface needs improvement. It is really too technical. That is the main problem. In the next release, they could offer more connectors for new databases, especially cloud databases.
The price would be the first thing I would want to change. Reduced cost would allow more customers to choose the product. It's quite expensive in relation to the cost of other similar solutions. I think it would also be helpful if the product was more adaptable to other platforms and vendors. I would also like to see an improvement in support.
The previous project was based on Microsoft SQL. It moved huge amounts of data from different data sources and DataStage to a middle stage, then moved it to Netezza. This created a bottleneck in the solution. We are trying to streamline it and create ETL processes. These will take data directly from the data sources and move it to Netezza without using a middle database. The data is quite detailed. We are talking about records in the tens to hundreds of millions. In future versions, we would be happy to see the ability to return several parameters from jobs. Now, jobs can return just one parameter. If they could return several parameters, that would be great. We would be happy if IBM could give us more tolerance for bad networks or VPN channels, as this happens from time to time. It would be great if we could use more than one SQL operator in the Source DB connector stage. Currently, in the target DB connection stage, we can use several SQL operators, but in the Source DB connector stage we can use only one. It would be better if we could use several. Data Vault is becoming more popular. It would be great if it appeared in the newest versions. I would like them to have more database procedures.
I really like this tool, but the administration should be on the same client application because a lot of administration features are not on the client-side, and they usually need to have administrative access. It's quite complicated to force IT teams to have separate administrative access from the developers. The platform also needs more stability. It caches a lot. It crashes on the application servers that the host allows on the platform. The solution needs better online tools for data, or for sourcing data on the internet. They have InfoSphere exchange but it's not as useful for DataStage.
The list of features that could be better starts with the user interface. It has been getting better in the last releases and in the past few years, and I guess that they will continue to make progress on this front. But even with the improvements that they have made, it could be even better now, and really should be. I think it's a little bit difficult to use because of the interface. Being user-friendly is important for any product and they need to make this adjustment. In addition to improvements in the base user interface, I would say it would be good to incorporate more interface options for cloud-based systems.
The solution should be more user-friendly.
The documentation and in-application help for this solution need to be improved, especially for new features. By comparison, in Talend, there is help available for all of the features. One of my clients has a problem using this solution with MongoDB. In the next release of this solution, I would like to see the ability to copy and paste schemas. It would be very good because as it is now, you have to save the schema to a repository and then re-load it. It can be done in Talend, but in DataStage, it is not as good.