Dynatrace Overview

Dynatrace is the #1 ranked solution in APM tools, top Mobile APM tools, top Container Monitoring tools, and top AIOps tools. IT Central Station users give Dynatrace an average rating of 8 out of 10. Dynatrace is most commonly compared to Datadog. Dynatrace is popular among the large enterprise segment, accounting for 63% of users researching this solution on IT Central Station. The top industry researching this solution is computer software, accounting for 29% of all views.
What is Dynatrace?

Dynatrace has redefined how you monitor today’s digital ecosystems. AI-powered, full stack and completely automated, it’s the only solution that provides answers, not just data, based on deep insight into every user, every transaction, across every application. More than 8,000 customers use Dynatrace to optimize customer experiences, innovate faster and modernize IT operations with absolute confidence.

Dynatrace Buyer's Guide

Download the Dynatrace Buyer's Guide including reviews and more. Updated: December 2021

Dynatrace Customers

Audi, Best Buy, LinkedIn, CISCO, Intuit, KRONOS, Scottrade, Wells Fargo, ULTA Beauty, Lenovo, Swarovski, Nike, Whirlpool, American Express

Pricing Advice

What users are saying about Dynatrace pricing:
  • "We license it for two environments, typically all of production and all of one lower environment, usually our staging environment. If there is a downside to Dynatrace, the only thing I can think of would be the cost. If it were cheaper, I'd have it in all my environments. I don't think they're charging more than it's worth, by any means. It's just that good software costs money."
  • "It is quite costly. Dynatrace was the most expensive, compared to the other products we looked at. But it was also a lot better. If you want value for your money, Dynatrace is the way to go."
  • "Dynatrace has a place for everybody. How you use it and what your budgetary limitations are will dictate what you do with it. But it's within everybody's reach. If you're a small organization and you have a large infrastructure, you may not be able to monitor the whole thing. You may have to pick and choose what you want to monitor, and you have the ability to do so. Your available funds are going to dictate that."
  • "It's understandable to do a smaller scale initial evaluation. However, as you identify the product value, don't be hesitant in your scope and scale to maximize the initial investment and your opportunity to do a bulk investment of the product."
  • "Dynatrace's pricing for their consumption units is rather arcane compared to some of the other tools, thus making forward-looking calculations based on capacity planning quite hard."
  • "The only limitation with scaling to cloud-native environments is licensing. It all depends on how many DEM units you're willing to license. The more DEM units you purchase, the more user data you can collect."
  • "The solution has saved us money through the consolidation of tools. With a hybrid landscape, we had multiple tools. When we consolidated, we removed four or five other monitoring tools with one. For the last ROI calculation that I did, Dynatrace was saving us up to $500,000 per year."
  • "If there are no corporate requirements to run Dynatrace Managed (operating it yourself), I would definitely go for the SaaS option. For small and medium-sized companies, the SaaS option is probably the cheapest one. You don't need to look into operating it. You don't need to run hardware. It is pay as you go."

Dynatrace Reviews

Barry Pieper
Manager, Performance Engineering at Medica Health Plans
Real User
Top 10
AI identifies all the components of a response-time issue or failure, hugely benefiting our triage efforts

Pros and Cons

  • "With Dynatrace, we have synthetic checks and real-user monitoring of all of our websites, places where members and providers can interact with us over the web. We monitor the response times of those with Dynatrace, and it's all integrated into one place."
  • "It has an integration with ServiceNow, which is great. Dynatrace creates tickets for things and its AI finds root cause. We have integrated that with our ServiceNow to generate events and incidents, so that all of our event management will be done in the ServiceNow Developer."
  • "They've leveraged those security gateways and renamed them ActiveGates, and now there are different web plugins we can run on it... Sometimes the development of those seems to be running very fast and it's not complete. They don't yet function quite as easily as the OneAgents do. But I have hopes that that's going to get better. We have tried the MQ, the Citrix, and the Oracle ActiveGate plugins. They could be sharper. It's the right direction to go. It just seems like it could be smoother."

What is our primary use case?

We're a health plan, a health insurer. We're not a big one, we have about a million members. We are growing through adding new business and we're looking to expand into the government programs: Medicare, Medicaid. Right now we provide individual and family, large corporate, self-insured, and a couple other types of health plans. 

We are headquartered in Minnesota, outside of Minneapolis. We have a data center in Minnetonka and one in another suburb. We do most of our work on-premise. We don't have much in the cloud for our core backroom applications. We use a package from a company called HealthEdge in Boston, to do our claims processing, membership, enrollment, etc.

Our main use case is application performance monitoring, right at Dynatrace's sweet spot. First, we wanted to know what the performance of our healthcare and our health claims processing system was. Then we wanted to be able to segment it by where the transaction response time is spent. We also wanted to get into the deep dive of the Java profile, because HealthEdge is a Java application that runs on several JVMs. We wanted not only to get into the Java code but to get into the SQL that's created to call into the database, which is where the response-time problems are. 

We're using Dynatrace SaaS now. It's the newest version.

How has it helped my organization?

Since we have the OneAgent feature available, we have real-user monitoring. So not only do we know the response time and availability of the synthetic route, but we know what real users experience on our website. If our service desk gets a call, which seldom happens — but let's say you, as a member, had trouble with something — we can go back and find exactly what you did and why the response was poor. We've used that many times to find errors. JavaScript errors caused by a setting in Internet Explorer were the latest ones that were annoying the members. But members don't call our service desk and say, "Hey, your website sucks." So we have to look at the data and say, "Geez, why does Internet Explorer have these huge JavaScript errors?" And then we find out.

We found an error where developers used a Google API that was supposed to load a Google Map and help a member find a place where they could go to a Medicare workshop. The API allowed only so many calls an hour, and we saw that, usually, about 45 minutes after the hour, that transaction was failing. It turned out that we'd used the 1,000 allocated calls and, when the hour turned over, it worked again. Dynatrace integrates all things monitoring from an application perspective: synthetic, real users, and Java deep-dive.
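The failure pattern in that story, a fixed hourly allocation that runs out partway through the hour and resets when the hour turns over, can be sketched in a few lines. This is only an illustration, not how the Google API is actually implemented: the `HourlyQuota` class, the `first_failure_minute` helper, and the steady request rate are assumptions. Only the 1,000-call allocation comes from the review.

```python
class HourlyQuota:
    """Fixed-window quota: at most `limit` calls per clock hour (hypothetical model)."""

    def __init__(self, limit=1000):
        self.limit = limit
        self.window = None  # index of the hour currently being counted
        self.used = 0

    def allow(self, minute_of_day):
        """Return True if a call made at this minute fits in the hourly quota."""
        hour = minute_of_day // 60
        if hour != self.window:  # the hour turned over, so the counter resets
            self.window = hour
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def first_failure_minute(calls_per_minute, limit=1000):
    """Minute within the hour at which the first call is rejected, or None."""
    quota = HourlyQuota(limit)
    for minute in range(60):
        for _ in range(calls_per_minute):
            if not quota.allow(minute):
                return minute
    return None
```

At an assumed steady 22 calls per minute, `first_failure_minute(22)` returns 45, which matches the "about 45 minutes after the hour" pattern. At 16 calls per minute the allocation is never exhausted and the function returns `None`.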

Dynatrace provides us with a huge benefit for triage because by the time a Dynatrace problem is open, AI has identified all the components and where the response-time issue is or where the failure is. It's really mindless. We don't have to try to pull out a map and figure out how the application looks. 

And Dynatrace has a feature called Smartscape. I don't use it a lot because their AI is so good that I've never had to go dig through it myself. But if I were to go through it, it would go from data centers to hosts to processes to services and applications, to show how they're all linked together. So it has a topology view. We use that sometimes when we're doing performance testing, which is something another part of my team does. They need to know which pieces are involved and this helps them know that. 

But from a day-to-day event-management and IT operations-center perspective, the Dynatrace AI is what has identified the failing component. The dashboard has all the problems. They open up these problems, which are already events in our ServiceNow environment, and these problems have the call-path and everything else laid out in them. So I've never had to dig into the Smartscape to figure out where my failure is. The Dynatrace AI has done that for me.

What we found early on in our HealthRules environment was that the response-time problems were, 99 percent of the time, in the type of SQL that we throw at the database, because the DBAs would say, "It's not the database, it's the bad SQL." Dynatrace helped us focus immediately on that and get away from: "Is it the network? Is it the server? Is this too busy?" There are all the different things that the vendor wants to throw at you. I went up to Boston to help the vendor a year or two ago. I took them right through the code and the response times and said, "Here's the piece of SQL that makes this particular function slow." Dynatrace was able to do that. We got there in minutes. They said, "Well, your server might be too busy, it might be your network," and I could say, "No, it's none of that. Here's the response time of that transaction and here is the decomposition of it. The thing runs for 13 seconds and spends 12 seconds on this one piece of SQL. I think that's where your problem is." Dynatrace was a huge help there.
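The kind of response-time decomposition described here comes down to asking which component owns most of the elapsed time. A minimal sketch of that arithmetic follows. It does not use any Dynatrace API; the function name `slowest_component` and the component labels are hypothetical, and the split of the one second outside the SQL statement is invented for illustration. Only the 13-second total and the 12 seconds of SQL come from the review.

```python
def slowest_component(breakdown):
    """Return (name, seconds, share_of_total) for the largest contributor."""
    total = sum(breakdown.values())
    if total <= 0:
        raise ValueError("breakdown must contain positive durations")
    name = max(breakdown, key=breakdown.get)
    return name, breakdown[name], breakdown[name] / total


# 13-second transaction from the review: 12 seconds in one SQL statement.
# How the remaining second is split is an assumption made for this example.
breakdown = {
    "sql_statement": 12.0,
    "java_code": 0.6,
    "network": 0.4,
}

name, seconds, share = slowest_component(breakdown)
print(f"{name}: {seconds:.1f}s ({share:.0%} of the transaction)")
```

With these numbers the SQL statement accounts for roughly 92 percent of the transaction, which is the point the reviewer made to the vendor.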

The solution has decreased our MTTR by well over 50 percent and maybe by as much as 90 percent. It enabled us to identify some things, first of all. Before, it was endless war rooms, and not really an identification. Dynatrace has driven that almost to zero. When the problem is opened, we know the root cause.

As for mean time to repair, since we know what we need to repair, we can point the developer right at it. It has decreased that by 50 to 60 percent.

It has also dramatically improved our uptime. One of the biggest problems we have with the JVMs, of course, is garbage collection and memory saturation. A memory leak will develop and Dynatrace will show the memory increasing steadily. It will create a problem and they'll work on the problem proactively, and either fix it or schedule graceful downtime. If they have to shut down the environment, they can stage through the three different servers in a type of HA arrangement. So without any disruption to the client, we've been able to fix things that would have turned into major outages of the whole environment. It's a definite help on the preventive side.

In terms of time to market, the guys who work on our web portal interface, who are in-house, were early adopters of the technology on our team and learned what works and what doesn't. Dynatrace has significantly decreased their time to market. They're not really part of the development cycle, but the way they use it and the things they say about it and the reports they've made indicate that it has probably cut nearly 50 percent of the development of their portal code.

It has also helped us with consolidation of tools. We got rid of some New Relic and we got rid of an older tool that was a great, early innovator in this space but was acquired by CA or Microsoft. We were still paying licenses for that and were able to consolidate it. We were about to buy a network tool to help us with ACI conversion on our network side, a tool that would mainly tell us who an application is talking to on the network. We use Dynatrace to do that, so we saved tens of thousands of dollars by not acquiring that tool. We also took the synthetic work that we paid an outsourced company to do for us and we converted all of that. Once we had Dynatrace in the house, we could do it ourselves and that saved $20,000 to $30,000 a year. There's probably more, if I were to look at it, that I could do with Dynatrace. I have to focus on the core system right now, but I think they'll get into the SNMP monitoring space soon, if they're not already there. And the plugins on the ActiveGates have a lot of capabilities we could use. We already monitor our VMware environment with it now.

We've started to use the Apdex score in all of our communications. It's a standard metric that's used for websites to indicate how they're performing. That idea is baked into Dynatrace and we've built on that throughout our company. The weekly service quality reports that are produced and sent via email to all Dynatrace users are starting to get some notice. They show, from the web portal side, what the Apdex is. Is it acceptable, tolerating, or unacceptable? It shows the percentages of the time of use and where they're coming from. It also shows it geographically and what type of browser most of your users are using. It shows how much of it is mobile versus desktop, which has proved very valuable to our digital experience people. Things like that are a huge benefit, and those are things I didn't even know existed when I bought it.
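For readers who have not met Apdex before, the standard calculation counts each response as satisfied (at or under a target time T), tolerating (between T and 4T), or frustrated (over 4T), and scores satisfied samples fully and tolerating samples at half weight. A minimal sketch follows. The function name `apdex` and the 2-second threshold in the example are assumptions, and nothing here reflects how Dynatrace is configured at the reviewer's company.

```python
def apdex(response_times, threshold):
    """Apdex score in [0, 1] for response times (seconds) and target time T."""
    if not response_times:
        raise ValueError("need at least one sample")
    # Satisfied: at or under T. Tolerating: over T but at or under 4T.
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    # Frustrated samples (over 4T) add nothing to the numerator.
    return (satisfied + tolerating / 2) / len(response_times)


# Five made-up page loads, assumed threshold T = 2.0 seconds:
# two satisfied, two tolerating, one frustrated.
samples = [0.5, 1.0, 2.5, 3.0, 9.0]
score = apdex(samples, threshold=2.0)
```

For these samples the score is (2 + 2/2) / 5 = 0.6.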

What is most valuable?

In addition to just the monitoring of the HealthRules ecosystem — which is typical BusinessWorks, Oracle Databases, and JVMs for transactions — we do a lot of web monitoring. With Dynatrace, we have synthetic checks and real-user monitoring of all of our websites, places where members and providers can interact with us over the web. We monitor the response times of those with Dynatrace, and it's all integrated into one place. We actively synthetically monitor our websites from two or three geographic locations. Our business is in nine States, so we're not international by any means. We sell health insurance to members in Oklahoma, Kansas, North and South Dakota, Wisconsin, Minnesota. We monitor those synthetically.

It also instruments .NET and BusinessWorks out of the box.

It has an integration with ServiceNow, which is great. Dynatrace creates tickets for things and its AI finds root cause. We have integrated that with our ServiceNow to generate events and incidents, so that all of our event management will be done in the ServiceNow Developer. We're working on that now. In terms of the self-healing aspect, we don't use Dynatrace to do that, although we could. We've gone down the path of trying to use ServiceNow's Orchestration. But we may come back to Dynatrace for that, depending on how that works.

In addition to ServiceNow, there is a CMDB integration, so when a Dynatrace problem is discovered, the Dynatrace ID correlates to a CMDB and that's how we open an incident or event. We don't need to do the correlation. If an event turns into an incident, then the correlation is done automatically with the Dynatrace ServiceNow application, which is in the ServiceNow store. It syncs up the CMDB's entries, the CIs, with the Dynatrace IDs so that all of the different pieces of the response-time puzzle that Dynatrace has, can be assigned to a CI in our CMDB. We are actively working on improving our discovery in CMDB, as it's not the most robust. Dynatrace is a huge help there because the OneAgent discovers all these things for us. So it helps with ServiceNow discovery as well.

The Dynatrace panel generally lets you know how many users it affects, and how many transactions or events in that application it affects. We don't use that a lot. That's beyond our capability right now, but I don't see any reason why it wouldn't be quite useful to assign severity from that.

What needs improvement?

Around the way licensing works, I would like to put it everywhere in infrastructure-only mode and I want it to be reasonable to do that.

From a technological standpoint, there is the OneAgent versus the plugins they have. They called them security gateways when they first came out. They're the way that the OneAgents talk to local ActiveGates, which communicate out to the Dynatrace cloud to store all the performance data. Instead of every agent going out to the cloud, there's just one spot, and security likes that. But they've leveraged those security gateways and renamed them ActiveGates, and now there are different web plugins we can run on them. Sometimes the plugins are designed for things where you can't put in an agent, like an Oracle instance of Exadata, or an Oracle appliance. We can't put a OneAgent on that. It's not a standard Linux or Windows OS, so the ActiveGate solution is better there. Sometimes the development of those seems to be running very fast and it's not complete. They don't yet function quite as easily as the OneAgents do. But I have hopes that that's going to get better. We have tried the MQ, the Citrix, and the Oracle ActiveGate plugins. They could be sharper. It's the right direction to go. It just seems like it could be smoother.

For how long have I used the solution?

I have been using Dynatrace for close to three years in my current company, and before that I used the earlier versions of Dynatrace, DC RUM, at a previous job.

What do I think about the stability of the solution?

I had one problem early on with WebLogic where Dynatrace was not stable and it would actually affect the ability of one of the WebLogic components. It was instrumented because we thought we needed it to be, but it didn't need to be. When we decided not to instrument it the problem went away. 

But that's the only stability issue I've ever had with it. That was the only time it's caused an outage or been responsible for high resource consumption. Typically the OneAgent is well under 1 percent CPU utilization and takes very little memory.

It's used constantly by several teams. They use the Dynatrace mobile app on their phones to get notified of problems in the environment before ServiceNow even notifies them. Our platform services team is the team responsible for the HealthEdge environment. If we were a bank, it would be all the backroom functions: it is where you pay claims, enroll members, credential providers, and maintain all that stuff. That support team has it on their phones. Our portal team also has the mobile app, so it's used constantly. I hear about it when it's not available, or if there's something odd going on with the mobile app.

What do I think about the scalability of the solution?

It could handle a much larger environment. I add ActiveGates mainly for redundancy. I don't think I need as many as I have. I could scale it out very large. I don't see any limitations. I've never had a problem with that other than my checkbook.

We've tried scaling it to cloud-native environments a little bit. We have a few things that are off-premises, like Microsoft Dynamics and Salesforce, which are in the cloud. We have a cloud-based application that does provider credentialing, as well. We don't have anything that we own in the cloud, so we can't instrument AWS or anything like that with it.

How are customer service and technical support?

Tech support has generally been pretty good. We get good response. They have a thing called Dynatrace ONE and I find the tech support to be best if I engage it through a chat window on Dynatrace. There's a place, right in the tool, where you can get a hold of a Dynatrace ONE person and they'll look at your problem right away. That seems to work better than the old model of calling support or sending an email, because you would go back and forth. "Send me more doc. What about this? Send me that." The Dynatrace ONE agent gathers everything he needs and, once he has all that, if he doesn't know what the problem is, at something like a level-one triage, he'll open the incident for you and it's done. I like that part. The traditional send-them-an-email, open-a-ticket-online takes too long. The Dynatrace ONE agent available through chat is a great concept. I encourage my team to use that rather than opening a problem. And that's included in the standard licensing.

How was the initial setup?

For our deployment, we did the first 40 in less than an hour. That required a part of one guy, and he maintains it all now. We have close to 200 nodes with OneAgent on them and four ActiveGates, synthetic monitoring, and plugins for MQ and Citrix, among other things. That takes three-fourths of a person on my team. I've federated the support for a lot of the stuff on our portal side. Our portal team developers fell in love with it so much that I just let them run with it and install it as needed. I give them more and more administrative rights. If you add their time, it works out to the equivalent of about a person.

We have close to 100 users. Some of them are just management who use the reports. Some of them are the portal team who are administrators, just like my team, and the majority are in IT. We're starting to take it out to our sales organization, as they're interested in the response time and other things.

What was our ROI?

We see ROI in performance tuning — improving application performance — big-time. We have teams using it constantly to make our digital experience better, performance-wise and availability-wise. Another part of my group is load testing. They use it as they do their load tests. They use LoadRunner to build a load test and use Dynatrace to monitor after every new release of the HealthRules code to tell them what's better and what's not. There is a huge ROI on load testing and performance testing.

There is also incident response, preventative incident response. We even had the CIO come into my boss's office one day and he was able to say that Dynatrace saw a problem and it was fixed and we didn't have an outage. And he looked at him and said, "That's how it's supposed to work, right?" What the CIO had been promised for 10 years, he finally actually saw an instance of it "in the wild" where we preemptively discovered a problem and fixed it. That's a huge win.

Also, reporting and analytics — to know what the response time is, and how many users use it, just the simple things — are huge.

I'm not sure how to estimate how much Dynatrace has saved us overall. But it's had to have saved us on the order of millions.

What's my experience with pricing, setup cost, and licensing?

We license it for two environments, typically all of production and all of one lower environment, usually our staging environment. If there is a downside to Dynatrace, the only thing I can think of would be the cost. If it were cheaper, I'd have it in all my environments. I don't think they're charging more than it's worth, by any means. It's just that good software costs money.

They have the OneAgent, which you buy and install. You can run that in infrastructure-only mode and pay less. The cost is a bit funny: it's calculated based on the memory size of the server you put it on. Sixteen gigabytes of memory, for instance, is one host unit, and a host unit costs you, say, $1,000. (I don't recall what the actual cost is; I'd have to look at our contract.) There's a switch they've added for infrastructure-only mode, which will cut that cost to about one-sixth or one-seventh of the cost of a full host agent. You won't get the deep-dive response time metrics, but you'll get the infrastructure stuff, which sometimes is all you want.
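The host-unit arithmetic described above can be sketched as a back-of-the-envelope estimate. Every number here is either the reviewer's illustrative figure or an assumption: the $1,000 price is the reviewer's placeholder, the one-sixth factor is the low end of the "one-sixth or one-seventh" range, and scaling host units linearly with memory is a simplification that may not match the vendor's real sizing table. The function name `estimate_cost` is hypothetical.

```python
GB_PER_HOST_UNIT = 16         # from the review: 16 GB of memory is one host unit
PRICE_PER_HOST_UNIT = 1000.0  # reviewer's placeholder, not a real list price
INFRA_ONLY_FACTOR = 1 / 6     # review says about one-sixth or one-seventh


def estimate_cost(server_memory_gb, infra_only=False):
    """Return (host_units, cost) for a list of server memory sizes in GB.

    Assumes host units scale linearly with memory, which is a simplification.
    """
    units = sum(gb / GB_PER_HOST_UNIT for gb in server_memory_gb)
    cost = units * PRICE_PER_HOST_UNIT
    if infra_only:
        cost *= INFRA_ONLY_FACTOR
    return units, cost


# Three hypothetical servers with 16, 32, and 64 GB of memory.
servers = [16, 32, 64]
full_units, full_cost = estimate_cost(servers)
_, infra_cost = estimate_cost(servers, infra_only=True)
```

Under these assumptions the three servers come to 7 host units and $7,000 in full-stack mode, or about $1,167 in infrastructure-only mode.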

In addition to the host agent fee, which is based on the memory size of the server and was the first thing I bought, there is a charge for the metrics that we collect through the ActiveGate plugins. They charge you per metric.

So the three principal things they charge you for are OneAgent, how many metrics you collect through the ActiveGate, and digital experience monitoring units, or DEM units. Those are basically the cost of the synthetic things, per test. Those things are quite reasonable in cost. The biggest cost is the OneAgent.

The cost to get us up, my first allocation, was under $100,000. My first PO was for about $60,000 and it covered almost our whole production HealthRules environment. We started out with 40 host units and we've grown to 200-plus, and we're a small place. Down the street is a health-related business and I think they have 20,000 host units.

Which other solutions did I evaluate?

We started by looking at industry reviews and selected the top four or five up in the upper-right quadrant: Dynatrace, AppDynamics, New Relic, and we had a brief look at what at that time was a CA product, or it might've been BMC.

We evaluated the four of them on paper and then brought two in for a trial, a proof of concept: Dynatrace and AppDynamics. Ultimately we selected Dynatrace.

There were several advantages to Dynatrace. Dynatrace was new. Its presence in the cloud was nice, but I could also run it on-prem if I wanted to and, at the time I didn't know which way I was going to go — which way I'd be allowed to go by security. AppDynamics was cloud-only at the time.

For installation, Dynatrace was trivial compared to AppDynamics. AppDynamics had an engineer onsite for two or three weeks and they still couldn't meet all of our use cases, which were pretty simple. I did them first. Then I went to Dynatrace and they said, "Well, download it, install it, and call us if you have any questions." And I thought, "Well, geez, don't I get any hand holding or anything?" It turned out that it was because I didn't need it. It was that simple. You download it, install it, and it injects itself. You can control it. It was just engineered for ease of use, by far. So the installation was night-and-day different. 

We have a lot of TIBCO BusinessWorks code around that we wanted to instrument, and with AppDynamics we had to go into every business process and change the startup. We had hundreds of them and that was a real pain. We had to select which ones and do the work, whereas with Dynatrace, it would discover them. Dynatrace has a concept called OneAgent, which you install on the server and it discovers things that you can monitor. You just click on them and say, "I want these monitored," or "Don't monitor these." It takes care of all that work and that was a huge difference. I didn't need a huge staff to maintain it. I didn't need a lot of time from the support teams (because they don't have it) to help me with monitoring. We were able to do the monitoring ourselves.

Then, once it was up and running, the use cases were pretty simple. One was to create a business-level dashboard of response time, and I don't think AppDynamics ever got that out for me. 

Dynatrace is easy to use from that perspective. It's easy to install and maintain. I have a small team and one person is my Dynatrace SME, but he does other things as well, so it's not even a full-time job.

What other advice do I have?

I've been doing this for close to 30 years. I've worked for software vendors and I've worked for major companies and now I'm at this small healthcare organization. The "holy grail" has always been the ability to decompose response time and Dynatrace has done that and integrated all of my APM needs in one tool. That is the biggest benefit to me. I can do application performance, from web to Java deep-dive, in one place. That's probably why it costs so much.

If you're thinking about Dynatrace, consider how easy it is to install and maintain. It has broad coverage and it's easy to use. I don't know how the rest of the market even competes anymore; it must be on cost.

As an APM tool, I'd probably rate it at nine out of ten. There are a few rough edges, but I think that's mainly because they're trying to do the right thing too fast.

Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
TR
User at a financial services firm with 10,001+ employees
Real User
Top 20
Helps us resolve incidents much faster, on both the front-end and the server-side

Pros and Cons

  • "Dynatrace is a single platform. It has all these different tools but they are actually all baked into the OneAgent technology. Within that OneAgent... you have the different tool sets. You have threat analysis, memory dumps, Java analysis, the database statements, and so on. It's all included in this OneAgent. So the management is actually quite easy."
  • "The solution's ability to assess the severity of anomalies based on the actual impact to users and business KPIs is great. It's exactly what we need. The severity impact is based on the users, the availability, and the impact it has on your business."
  • "The solution's ability to assess the severity of anomalies based on the actual impact to users and business KPIs is great. In my opinion, it could be extended even more. I would like it to be more configurable for the end-user. It would be nice to have more business rules applicable to the severity. It's already very good as it is now. It is based on the impact on your front-end users. But it would be nice if we could configure it a bit more."
  • "Another area for improvement is that I would like the alerting to be set up a little bit more easily. Currently, it takes a lot of work to add alerting, especially if you have a large environment, and I consider our environment to be quite large. The alerting takes a lot of administration."

What is our primary use case?

We use it to follow up user experience data. It's all banking applications. For example, when you're viewing your account, you open up your mobile app and the click you do to view your account is measured in Dynatrace. It's stored and we are checking the timing at each moment. 

We are also following up the timing differences between our different releases. When we have a new version release, we are already checking within our test environment to see what the impact of each change is before it goes to production. And we follow that up in production as well.

In addition, we are following up the availability of all our different systems. 

And root cause analysis is also one of the main business cases.

So we have three main use cases:

  1. To follow up what's going on in production
  2. Proactively reacting to possible problems which could happen
  3. Getting insights into all our systems and seeing the correlation between these different systems and improving, in that way, our services to our end users.

We use the on-prem solution, but it's the same as the SaaS solution that they are offering. They have Dynatrace SaaS and Dynatrace Managed, and ours is Managed. Currently we're on version 181, but that changes every month.

How has it helped my organization?

The dynamic microservices for Kubernetes is really value-added because there is a lot of monitoring functionality already built into Kubernetes Docker. There are also free things like Prometheus which can display that. That's very good for technical people. For the owner of the pod itself, that's enough. But those things don't provide any business value. If you want business value from it, you need to extract it to a higher level, and that's where you need the correlations. You need to correlate what is between all these different services. What is the flow like between the services? How are they interconnected? And that's where Dynatrace gives added value. And the fact is that you can combine these data, which are coming from Kubernetes, and include them in Dynatrace, meaning you have a single pane of glass where you can see everything. You can see the technical things, but you have the bigger business value on top of it, as well.

Before Dynatrace, we were testing just by trying out the application ourselves and getting a feeling for the performance. That's how it very often would go. You would start up an application and it was judged by the feeling of the person who was using it at that moment in time. That, of course, is not representative of what the actual end-user feeling would be. We were totally blind. We actually need this to be able to be closer to the customer. To really care about the customer, you need to know what he is doing. 

Also, incidents are resolved much faster by using Dynatrace. And that's for front-end, because we actually know what is going on. But it's also for server-side incidents where we can see the correlation. Using this solution our MTTR has been lowered by 25 percent. It's pinpointing the actual errors or the actual database calls, so it goes faster. But, of course, you still have to do it. It still needs to be implemented. It doesn't do the implementation work for you.

Root cause detection, how the infrastructure components interact with each other, helps. We know what is going wrong and where to pinpoint it. Before, we needed to fill a room with all the experts. The back-end expert would say, "I'm not seeing anything on the back-end." And the network expert would say, "I'm not seeing anything on the network." When you see the interaction between the different aspects, it's immediately clear you have to search in your Java development, or you have to search in your database, because all the other ones don't have any impact on the performance. You see it in Dynatrace because all the numbers are there. It really helps with that. It also helps to pinpoint which teams should work on the solution. In addition to the fact that it's speeding up the process of finding your root cause, it's also lowering the number of people who need to pay attention to the problem. It's just a single team that we need to work on it. All the rest can go home.

It has decreased our mean time to identification by 90 percent, meaning it only takes us one-tenth of the time it used to, because it immediately pinpoints where the problem is.

Dynatrace also helps DevOps to focus on continuous delivery and to shift quality issues to pre-production because we are already seeing things in pre-production. We have Dynatrace in our test environment, so we have a lot of extra information there, and DevOps teams can actually work on that information.

Finally, in terms of uptime, it's signaling whenever something is down and you can react to the fact that it is down a lot faster. That improves the uptime. But the tool itself, of course, doesn't do anything for your uptime. It just signals the fact that it's down faster so you can react to it.

What is most valuable?

The most valuable aspect is the fact that Dynatrace is a correlation tool for all those different layers. It's the correlation from the front-end through to the database. You can see your individual tracks.

One of the aspects that follows from that is the root cause analysis. Because we have these correlations, we can say, "Hey it's going slow on the server side because a database is having connection issues," for example. So the root cause is important, but it's actually based on the correlation between the different layers in your system.

Dynatrace is a single platform. It has all these different tools but they are actually all baked into the OneAgent technology. Within that OneAgent (which is growing quite large, but that's something else) you have the different tool sets. You have thread analysis, memory dumps, Java analysis, the database statements, and so on. It's all included in this OneAgent. So the management is actually quite easy. You have this one tool, and you have server-side and agent-side which are ways of semi-automatically updating it. We don't have to do that much management on it. Even for the quite large environment that we have, the management, itself, is quite limited. It doesn't take a lot of time. It's quite easy.

The solution's ability to assess the severity of anomalies based on the actual impact to users and business KPIs is great. It's exactly what we need. The severity impact is based on the users, the availability, and the impact it has on your business.

We also use the real-user monitoring and we are using the synthetic monitoring in a limited way, for the moment. We are not using session replay. I would like that, but it's still being considered by councils within the company as to whether we are able to use it.

We are using synthetic monitoring to measure the availability of one of our services. It's a very important service and, if it is down, we want business to be notified about this immediately. So we have set up a synthetic monitor, which is measuring the availability of that single service each minute. Whenever there is a problem, an incident will be immediately created and forwarded to the correct person. This synthetic monitoring is just an availability check in HTTP. It's actually a browser which is calling up a page and we are doing some page checks on this page to be sure that it is available. Next to the availability, which the synthetic monitoring gives us, we also measure the performance of this single page, because it's very important for us that this page is fast enough. If the performance of this single page degrades, an incident is also created for the same person, and he can respond to it immediately.
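The check described above (availability first, then performance of the same page) can be sketched roughly as follows. This is only an illustration of the decision logic, not Dynatrace's API or configuration: the names `CheckResult` and `evaluate_check`, the expected page text, and the timing threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CheckResult:
    """Outcome of one synthetic page load (hypothetical structure)."""
    status_code: Optional[int]  # None when the request failed outright
    body: str                   # page content that came back
    elapsed_ms: float           # how long the page took to load


def evaluate_check(result: CheckResult,
                   required_text: str,
                   max_elapsed_ms: float) -> List[str]:
    """Return the incidents (possibly none) raised by one check run."""
    incidents = []
    if result.status_code != 200:
        # Availability: the page did not load successfully.
        incidents.append("availability: unexpected status %s" % result.status_code)
    elif required_text not in result.body:
        # Availability: the page loaded but failed the content check.
        incidents.append("availability: expected page content missing")
    elif result.elapsed_ms > max_elapsed_ms:
        # Performance: the page is available but too slow.
        incidents.append("performance: %.0f ms exceeds %.0f ms"
                         % (result.elapsed_ms, max_elapsed_ms))
    return incidents
```

In a setup like the one described, a scheduler would run such a check once per minute and forward any returned incident to the responsible person.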

Real-user monitoring is a big part of what we are doing because we are focusing on the actual user experience. I just came from a meeting, 15 minutes ago, where we discussed this issue: a slowdown reported by the users. We didn't see anything on the server side but users are still complaining. We need to see what the users are actually doing. You can do that in debug tools, like Chrome Debugger, to see what your network traffic is and what your page is doing. But you cannot do that in production with your end-users. You cannot request that your end-users open their debug tools and tell you what's going on. That's what Dynatrace offers: insight like the debug tools for your end-user. That's also exactly what we need.

Most of the problems that we can respond to immediately are server problems, but most of the problems that occur are currently front-end problems. More and more, performance issues are located on the machine of the end-user, and so you need to have insight into that. A company of our size is obliged to have insight into how its actual users are doing. Otherwise, we're just blind to our user experience.

Dynatrace also provides a really nice representation of your infrastructure. You have all your servers, you have all your services, and you know how they communicate with each other.

What needs improvement?

While it gives you a good view of all the services that are instrumented by Dynatrace — which is good, of course, and that's what it can do — in our case, our infrastructure is a lot bigger than the part that is instrumented by Dynatrace only. So we only see a small part of the infrastructure. There are a number of components which are not instrumentable, like the F5 firewalls, switches, etc. So it gives a good overview of your server infrastructure. That's great, we need that. But it's lacking a bit of network segmentation and switches. So it's not a representation of your entire infrastructure. Not every component is there.

The solution's ability to assess the severity of anomalies based on the actual impact to users and business KPIs is great. In my opinion, it could be extended even more. I would like it to be more configurable for the end-user. It would be nice to have more business rules applicable to the severity. It's already very good as it is now. It is based on the impact on your front-end users. But it would be nice if we could configure it a bit more.

Another area for improvement is that I would like the alerting to be set up a little bit more easily. Currently, it takes a lot of work to add alerting, especially if you have a large environment, and I consider our environment to be quite large. The alerting takes a lot of administration. It could be a lot easier. It would not be that complicated to build in, but it would take some time.

I would also like the visual representation of the graphs to be improved. We have control of the actual measures which are in the graphs, but we are not able to control how the axes are represented or the thresholds are represented. I do know that they are working on that.

For how long have I used the solution?

I have been using the Dynatrace AppMon tool for six years and we changed to the new Dynatrace tool almost three years ago.

What do I think about the stability of the solution?

We haven't had any issues with the stability of Dynatrace, and it's been running for a long time. We use the Managed environment, so it's an on-prem service, but it's quite stable. We are doing the updates pretty regularly. They come in every month but we are doing them every two or three months. First we do them in the test phase and then in the production phase. But we have not experienced any downtime ever.

What do I think about the scalability of the solution?

For us, Dynatrace is scalable and we haven't seen any issues with that. We did need to install a larger server, but that's because we have a managed environment. You don't have that problem if you go with the SaaS environment. We don't see any negative impact on the scale of our products, and we are already quite large. It's quite scalable.

In terms of the cloud-native environments we have scaled Dynatrace to, we are using Dynatrace on an OpenShift platform, which is a Docker Kubernetes implementation from Red Hat. We have Azure for our CRM system, which Dynatrace monitors, but we are not measuring the individual pods in there as it is not a PaaS; it's a SaaS solution of course.

As for the users of the solution, we make a distinction between the users who are deploying stuff and those who are managing the Dynatrace stuff. The latter would be my team, the APM team, and we are four people. The four people are installing the Dynatrace agents, making sure the servers are alright, and making sure the management of the Dynatrace system itself is okay.

The users of the tool are the users of the different business cases. That includes development and business. There are about 500 individual users making use of the different dashboards and abilities within Dynatrace. But we see that number of users, 500, as a bit small. We want to extend that to over 1,000 in the near future. But that will take some advertising inside the company.

How are customer service and technical support?

I use Dynatrace technical support on a daily basis. They have a live chat within the tool and that comes for free with the tool itself. All 500 of our users are able to use this chat functionality. I'm using it very frequently, especially when I need to find out where features or functionalities are located within the tool. They can immediately help you with first-line support for the easy questions and that saves you a lot of time. You just chat and say, "Hey, I want to see where this setting can be activated," and they say, "Just click this button and you will be there." 

For the more complex questions, you start with tickets and they will solve them. That takes a little bit longer, depending on how complex your question is. 

But that first-line support is really a very easy way to interact with these people, and you get more out of the tool, faster.

Which solution did I use previously and why did I switch?

We purchased the Dynatrace product because we had some issues with our direct channels, our customer-facing applications. There were complaints from the customer side and we couldn't find the solution.

There were also a number of our most important applications that needed more monitoring. We had a lot of monitoring capabilities on the server side and on the database side, but the correlation between all these monitoring tools was not that easy. When they came up with a problem they would say, "Hey, it's not the mainframe, it's not the database, it's not the network." But what was it? That was still hard to find out. And we were missing some monitoring on the front-end. The user experience monitoring was lacking. We investigated a number of products and Dynatrace came out as the best.

How was the initial setup?

We kind of grew into Dynatrace. Our initial scope was quite small, so it was not that complex. Currently, our scope is a lot broader, but it is not complex for us because we have been working with the tool for such a long time. Overall, it's quite straightforward. If you're starting with this product from scratch and you have to find out everything, it can take some time to learn the product. But it's quite straightforward.

We started with the AppMon tool, which was the predecessor to the current tool. Implementing that went quite fast because it was a very small scope. When we changed to the Dynatrace Managed it took us half a year. And that's not including the contract negotiations. That was for the actual implementation: Finding out all business cases and all the use cases that we had, transforming them into the new tool, and launching it live for a big part of our company. That took half a year.

What about the implementation team?

We hired some external experts from a company in Belgium, which is called Realdolmen. They really helped us in the implementation. They had experience in implementing Dynatrace for other companies already, so that really helped. And I would advise that approach. If you're doing it all by yourself, you are focusing on what your problems are, while if you are adding an external person to it, who is also an expert in the product itself, he will give you insights into how the product can benefit you in ways you couldn't have imagined.

What was our ROI?

The issue of whether Dynatrace has saved us money through consolidation of tools is something we are working on. There are a number of things that we are replacing now by things that are already present in Dynatrace. If you currently have a lot of different tools, it will save you money. But Dynatrace is not the cheapest tool. Money-saving should not be your first concern if you buy Dynatrace.

It depends on your business case, but as soon as you are at a reasonable size and you have different channels to connect within your company — mobile and web and so on — you need to have a view into your infrastructure and that's where Dynatrace provides real benefits. It's not for a simple company. It's not for the bakery store around the corner. But as soon as you hit a reasonable size, it gives enough added value and it's hard to imagine not having it or something comparable.

"Reasonable size" depends a bit on your industry. But it is connected with the number of customers you have. We have about 25,000 concurrent customers, at a given moment in time. As soon as you have more than 1,000 concurrent customers, you need this tool to have enough analysis power. It gives you power for tracking the individual user and it gives you the power to aggregate all the data, to see an overview of how your users are doing. This combination really gives you a lot of benefits.

What's my experience with pricing, setup cost, and licensing?

It is quite costly. Dynatrace was the most expensive, compared to the other products we looked at. But it was also a lot better. If you want value for your money, Dynatrace is the way to go. 

Which other solutions did I evaluate?

In my opinion, the product is extremely good and comparable. We did compare it to AppDynamics and New Relic and we saw that Dynatrace is actually the best product there is. If you are looking for the best, Dynatrace will be your product.

What other advice do I have?

The biggest lesson that I have learned from Dynatrace is that application performance monitoring is very complex, but the easiest part of it is the technical aspect. The more complex thing is all the internal company politics around it. We see a lot of data and if you are targeting some people and say, "Hey, your database is going slowly," they will respond to it very defensively. If they have their own monitoring tools, they can say, "Oh no, my database is going very fast. See my screen is green." But we have the insights. It's all data, and gathering the data is the technical aspect. That's easy. But then convincing people and getting people to agree on what is obvious data is far more complex than the technical aspects.

The way to overcome that is talking. Communication is key.

I'm a little bit skeptical about the self-healing. I have heard a lot about it. I have gone through some Dynatrace instances where they have this self-healing prophecy. I think it's difficult to do self-healing. We are not using it in our company. There is a limited range of problems that you can address with it. It's only if you definitely know that this solution will work for this problem. But problems are always different, every time. And if you have specific knowledge that something will work if a particular problem arises, most of the time you can just avoid having the problem. So I'm a little bit skeptical. We are also not using it because we have a lot of governance on our production environment. We cannot immediately change something in production.

We are using dynamic microservices within a Kubernetes environment, but the self-healing is a little bit baked into these microservices. It's a Docker Kubernetes thing, where you have control over how many containers or pods you want to spin up. So you don't need an extra self-healing tool on top of that.

In terms of integrating Dynatrace with our CI/CD and ITSM tools, we are working on both of those directions, but we are not there yet. We have an integration with our ITSM tool in the sense that we are registering incidents from Dynatrace in our ServiceNow. But we are not monitoring it as a component management system.

We are not doing as much as I would want to for these Quality Gates. That can be improved in our company. Dynatrace could help with that, but I would focus on something else like Keptn, or something else that integrates with Dynatrace, to provide that additional functionality. Keptn would be more suitable for that, than the Dynatrace tool itself, but they are closely linked together. For us, that aspect is a work-in-progress.

I would rate Dynatrace a nine out of 10, because it has really added value to my daily business and what I have to do in performance analysis. It can be improved, and I hope it will be improved and updates will be coming. But it's still a very good tool and it's better than other tools that I have seen.

Which deployment model are you using for this solution?

On-premises
Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Mark Kaplan
Senior Director IT at BARBRI Inc.
Real User
Top 5 Leaderboard
Gives us very deep visibility into both user actions and systems interactions, including a view inside containers

Pros and Cons

  • "The Session Replay not only allows us to watch the user in 4K video, but to see the individual steps happening behind the scenes, from a developer perspective. It gives us every single step that a user takes in a session, along with the ability to watch it as a video playback. We can see each call to every server as the user goes through the site. If something is broken or not running optimally, it's going to come up in the Session Replay."
  • "I would love to see Dynatrace get more involved in the security realm. I get badgered by so many endpoint protection companies. It seems like a natural fit to me, that Dynatrace should be playing in that space."

What is our primary use case?

When we started with Dynatrace we were an on-prem organization. We used it in the early days as an APM, the way most people used it.

Our usage of Dynatrace has grown over the years, not as much in terms of capacity as in usability. It is now used by three departments within our organization. It originally started with just my group, which is IT, and then we rolled it out to development because they saw the advantages of being able to identify code bottlenecks in existing code. We've rolled it out to operations and they use Session Replay to troubleshoot customer-specific issues. And the sales department also uses it to gauge productivity and how many visits we get to a particular page, how many times people watch a particular video, how many take a certain practice exam, etc. 

Those use cases are all in addition to its core use, which is to help us keep our infrastructure running. 

We're currently using the Dynatrace SaaS, the Dynatrace ONE product. We're not using anything in the old, modular product. It fits very well for us. We are a cloud organization. We're all Azure now. We migrated from on-prem to cloud about three years ago.

How has it helped my organization?

The automated discovery and analysis definitely help us to proactively troubleshoot production and pinpoint underlying root cause, both from a code perspective as well as an infrastructure perspective. When we get an alert, or we're seeing a degradation in performance, Dynatrace will lead us down the path: Where do we need to look first? It will tell us that it has analyzed so many trillions of dependencies and that it thinks that the problem is "here," and it will point to a query or a line of code or perhaps to a system or to a container that is not functioning properly. Depending on what the problem is, it saves us an enormous amount of time in troubleshooting and identifying problems.

I estimate it has cut our mean time to identification at least in half, if not more. Before, we were relegated to combing through logs. We would take Splunk, look for the error, find out where it was occurring, how many times it was occurring — do all that type of investigation that you normally need to do. We don't have to do that anymore because it's all automated. 

And as far as decreasing our mean time to repair goes, it's closer to 60 to 70 percent. The reason is that we don't need to spend nearly as much time troubleshooting. We take its recommendation, and the time that we spend goes into checking that Dynatrace was right. We'll test out a quick fix in dev and then take it to QA and then push it to production. In some instances, it does reduce our MTTR by anywhere from 60 to 70 percent, although it really depends on the problem.

I operate an entire stack on four people, and the only way I'm able to do that is by automating as much as I can and having tools that I can rely on to reduce time-dependent tasks. Dynatrace has allowed me to function and keep my people productive without working them 24/7. Dynatrace works 24/7 for me.

Another thing that Dynatrace gives us is very deep visibility, not only into user actions but systems interactions. How are the systems relating to each other? Are the right systems talking to the right systems? When we first deployed Dynatrace five years ago, it showed us, through its Smartscape tool, that we had servers talking to servers they shouldn't be talking to. That was quite an eye-opener. I've noticed that a lot of companies are trying to copy what Dynatrace came out with in its Smartscape, but to me, it is the best visualization tool of your app stack and network that you'll ever put together, and you don't have to do anything. The system puts all that together. You deploy your one agent, it maps out the system, and you can see everything from application to network to infrastructure connectivity. It depends what you want to see, but it's all Smartscape'd out. You can tell what traffic is going in which direction and where it's going.

In addition, when I first started using Dynatrace, I had a routine. I would come into the office early and go through all of the night's activities. I would check for any problems we had: Was anything broken, were there any active alerts? With Dynatrace Davis, I started getting those reports automatically, through Amazon Alexa, and I do that on my drive to work. Instead of having to go in early and spend time in the office, I'm able to stay at home a little later, have breakfast with the family. Then, when I'm in the car, I invoke Alexa to give me my Dynatrace morning report, which will include my Apdex rating, any open problems, and a summary of closed problems. It's probably one of the least advertised aspects of Dynatrace, and one which I think is among the most highly efficient tools that they offer.
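For readers unfamiliar with the Apdex rating mentioned above, the standard Apdex score counts samples at or below a threshold T as satisfied, samples up to 4T as tolerating, and everything slower as frustrated. The sketch below follows that general definition; the function name and the sample values are illustrative assumptions, and the exact calculation Dynatrace applies to user actions may differ.

```python
from typing import Iterable


def apdex(response_times_s: Iterable[float], threshold_s: float) -> float:
    """Standard Apdex: (satisfied + tolerating / 2) / total samples."""
    times = list(response_times_s)
    if not times:
        raise ValueError("at least one sample is required")
    # Satisfied: at or below the threshold T.
    satisfied = sum(1 for t in times if t <= threshold_s)
    # Tolerating: above T but no more than 4T.
    tolerating = sum(1 for t in times if threshold_s < t <= 4 * threshold_s)
    # Frustrated samples (above 4T) count only in the denominator.
    return (satisfied + tolerating / 2) / len(times)


# Four page loads against an assumed 3-second threshold:
# two satisfied, one tolerating, one frustrated.
score = apdex([1.0, 2.0, 5.0, 20.0], threshold_s=3.0)  # 0.625
```

A score of 1.0 means every sampled user was satisfied; the closer to 0, the more users were frustrated.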

The amount of time we have to devote to maintaining Dynatrace is next to nothing. The time that we spend in Dynatrace is actually using it. We're using it to look at what's happening, what's going on, is something broken, or do we have an alert? We go in to find out what's wrong. Maintaining it is really almost nonexistent.

Another advantage is that it is much more of a proactive tool than it is one for putting out fires. Of course, it helps us tremendously if we have to put out a fire, but our goal is to never have a fire. We want to make sure that any deployments that we put out are fully tested in all aspects of use, so that when things are fully deployed, there isn't any need for a rollback. In the last three years, we've had to roll back a production deployment once. I don't attribute that all to Dynatrace, but I attribute a large part of it to it.

It has increased our uptime because we find out about problems before they're problems. The one goal that my team has, above anything else, is to know about problems before the customer does. If the customer is telling us there's a problem, we have failed. We are so redundant and so HA-built, that there is absolutely no reason for us not to be able to circumvent an issue that is under our control, and to prevent any type of a work stoppage or outage. We can't help it if the internet goes down or if Microsoft has a core problem, but we can certainly help by making sure that it's not our application stack or our infrastructure. I would estimate our uptime is better by at least 20 percent.

In the end, it has decreased our time to market with new innovations and capabilities, because anything that reduces time-to-produce decreases time to market. Once the code has actually been developed, it's in testing and deployment and that's where my window of efficiency is. I can't control how long it takes to build something, but I can control how long it takes to fully test it and deploy it. And there, it has saved us time.

Before we had Dynatrace, and a lot of the processes that Dynatrace has helped us put into place, everything was manual. And the more manual work you have, the more margin for human error you have.

What is most valuable?

The most valuable features really depend on what I'm doing. The most unique feature that Dynatrace offers, in my opinion, is Davis. It's an AI engine and it's heavily integrated into the core product.

The Session Replay not only allows us to watch the user in 4K video, but to see the individual steps happening behind the scenes, from a developer perspective. It gives us every single step that a user takes in a session, along with the ability to watch it as a video playback. We can see each call to every server as the user goes through the site. If something is broken or not running optimally, it's going to come up in the Session Replay. 

We also use the solution for dynamic microservices within a Kubernetes environment. We are in the process of converting from Docker Swarm to Kubernetes, but that is in its infancy for us and will grow as our Kubernetes deployments grow. Dynatrace's functionality in this is really good. 

We use JIRA as well as Jenkins. We have a big DevOps push right now and Dynatrace is an integral part of that push. We're using Azure DevOps, and tying in Dynatrace, Jenkins, and JIRA and trying to automate that whole process. So Dynatrace plays a role in that as well.

In terms of the self-healing, we use the recommendations that it provides. I'd say the Davis engine runs at about 90 percent accuracy in its recommendations. We have yet to allow automated remediation, which is our ultimate goal. It's going to be a bit before we get comfortable with anything doing that type of automated work in production. But I feel that we're as close as we've ever been and we're getting closer.

User management is extremely — and I hate to use the word "easy" — but it really is. And it's a lot easier today than it was when we first started with Dynatrace. We create a lot of customized dashboards both for the executive teams and management teams. These dashboards are central to their areas of oversight. It used to take quite a bit of time to create dashboards. Now it even has an automated tool that takes care of that. You just tell it what you want it to present and everything falls together. It has templated dashboards that you can customize.

The single agent does all of it. Once you deploy the one agent to your environment, it's going to propagate itself throughout the environment, unless you specifically tell it not to. It is the easiest thing that we've ever owned, because we don't have to do anything to it. It self-maintains. Every once in a while we'll have to reinstall the agent on something or a new version will come out and we'll want to deploy it, but for the most part, it's set-it-and-forget-it.

What needs improvement?

I would love to see Dynatrace get more involved in the security realm. I get badgered by so many endpoint protection companies. It seems like a natural fit to me, that Dynatrace should be playing in that space.

I'd also like to see some deeper metrics in network troubleshooting. That's another area that it's not really into.

For how long have I used the solution?

We're in our fifth year of using Dynatrace. We were the very first paying customer for the new platform, Dynatrace ONE. We used it right at launch.

What do I think about the stability of the solution?

The stability has been phenomenal. I'm not going to say that Dynatrace has never had an outage, but I've never had an outage where Dynatrace wasn't available for me. It's always been there. It's always there when I need it. It's always on. Our uptime is five-nines, and we do attribute a large portion of our ability to maintain that figure to Dynatrace.
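To put "five-nines" in perspective, the arithmetic can be sketched as below. The function name is made up for the example, and a 365-day year is assumed.

```python
def allowed_downtime_minutes(availability_percent: float,
                             period_days: float = 365.0) -> float:
    """Downtime budget, in minutes, for a given availability over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# Five nines (99.999 percent) leaves roughly 5.26 minutes of downtime per year.
budget = allowed_downtime_minutes(99.999)
```

By comparison, three nines (99.9 percent) would allow about 525.6 minutes, or close to nine hours, per year.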

What do I think about the scalability of the solution?

In terms of scalability, we don't have anything that it can't do. As we add to our infrastructure, it scales. Yes, every time we add a node, we're going to spend more. But it's up to me to decide if I want to monitor everything or a set of everything. My philosophy is to monitor all of production. Anything that is deployed to production is being monitored by Dynatrace. 

From a dev and test perspective we don't monitor like that. We keep a secondary Dynatrace instance that we use in the event that we need to troubleshoot something in development, but for the most part, our Dynatrace usage is relegated to production. And that's for cost reasons.

We have four environments in our builds. We have production, where we cover everything. We have a development environment, which is a subset of production, with different copies. We have QA, which is where everything goes from development for final testing. And then we have staging, which is the final step before it's pushed to the production clusters.

As we add to production, we add to Dynatrace. That is always going to be the plan. We will not deploy anything to production that doesn't have Dynatrace on it.

I don't get involved in the minutiae, but from what the guys tell me, with Linux servers you don't even blink. They have to watch Windows servers a little bit more because it's more intensive. Windows itself doesn't tend to perform very well when you first build. You've got to massage it and get it to where you want it to be. Dynatrace helps us with that, but Windows is more finicky.

We have about 50 users of Dynatrace between infrastructure, development, operations, and sales.

How are customer service and technical support?

Their technical support is the best ever. I know I sound like a broken record, but we get chat support on the Dynatrace site, not from some guy in India, but from a high-level tech in the US who has all the answers to the questions. That person is not like some first-level guy who's going to ask you if your machine's booted up. The techs can answer our questions and, if they can't, they open the ticket and get back to us later. It's the best support model I've ever had the pleasure of working with.

Which solution did I use previously and why did I switch?

We were using New Relic at the time. We were having a lot of frustrations with that in terms of its dashboarding capabilities, and the amount of time that my people had to spend keeping it updated and running correctly. We started looking at other products and we ended up settling on Dynatrace. Aside from its major capabilities, what Dynatrace ended up doing for us was to assist us in our migration to the cloud, because it gave us the sizing recommendations and the baselines that we needed to formulate what we were going to start with in Azure.

New Relic was the primary APM at the time and we were just very frustrated with it. We started looking at other products and really didn't see much of a difference in the competition, differences that would warrant going through the change, until we came upon what was then called Ruxit and is now called Dynatrace.

The biggest difference was that the other solutions required overhead. My biggest complaint was the amount of time we had to spend with these tools, because they're supposed to save you time, not take up more of your time. Dynatrace was the first one to actually complete that promise.

We ran hybrid for a year, collecting data on both ends, using Dynatrace both on-prem and in the cloud, and now it's all cloud.

How was the initial setup?

The setup is really not much different, whether you're an on-prem organization or a cloud or even a hybrid. It's still the one agent. I have no experience with their AppMon product, so I can't tell you how much easier the new product is versus the old. But I can tell you that this product that we have been using is the easiest thing we've ever had. The only comment I got from my systems team is, "Why didn't we get this sooner?"

I am not the norm when it comes to policy and procedure. I tend to buck the trends a little bit. If I have a new product that I feel is going to be advantageous to the company and my team as a whole, then once we've done our due diligence, we will just deploy it. I know that larger companies with different criteria and regulations have to follow different channels and paths, through security and infrastructure and storage, etc. But ultimately, as long as you have "air-cover," and by that I mean an executive sponsor who believes in what you're doing, then you really should be able to get it done with minimal effort.

We were fully up and running in a week. It took me longer to remove New Relic than it did to deploy Dynatrace. We only needed one person to deploy Dynatrace. One of my systems people took care of it. I took care of the administrative stuff, creating the initial dashboards and getting the payments set up and so forth, but my systems people took care of the actual deployment of the one agent.

What about the implementation team?

I didn't hire any contractors or deployment services. I signed up for Dynatrace's free trial and we went to town.

What was our ROI?

From a monitoring-tool perspective, Dynatrace has saved us money through consolidation of tools. We used to use a number of tools: PRTG, Pingdom, and we used to pay for an additional Azure service that we don't pay for anymore. And we used to use Splunk for log mining and now we don't. Just in the tools that we eliminated it has saved us $30,000, but there are more soft dollars that I could add to that.

I'm not sure how you come up with an ROI because it's pretty much all soft dollars. It's a line item in my budget that doesn't have to grow unless we grow. We have not experienced a base-price increase from Dynatrace.

What's my experience with pricing, setup cost, and licensing?

Dynatrace is not the cheapest product out there and it's not the most expensive product out there. In our business, you get what you pay for. 

Dynatrace has a place for everybody. How you use it and what your budgetary limitations are will dictate what you do with it. But it's within everybody's reach. If you're a small organization and you have a large infrastructure, you may not be able to monitor the whole thing. You may have to pick and choose what you want to monitor, and you have the ability to do so. Your available funds are going to dictate that.

The only additional costs that I incur are for additional log storage space, which is like $100 a year.

What other advice do I have?

My advice would be to compare and compare again. Everybody's offering free trials, and I know that they're a pain to do, but compare the products, apples for apples. Everybody's going to compare costs, but be sure to compare the functionality. Are you getting what you pay for? Are you getting the bang for your buck out of what the product is returning to you? If all you need to know is "my server's down," you can probably get by with the cheapest thing out there. But if you want to know why the server is down, or that the server is about to go down and you need to do something, then you want a product like Dynatrace.

I go to their Perform conference every year, and it's amazing to me to see the loyalty and dedication from the customer side. It's like a family reunion every year when we go to Perform. I hope we have it next year.

From a core-product perspective, Dynatrace is doing everything that we ever asked for. Everything that we've ever wanted to monitor, it has always been there first.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
RM
IT Technical Architect at an insurance company with 5,001-10,000 employees
Real User
Top 5 Leaderboard
Provides traceability from tracing transactions of end users all the way through the back-end systems

Pros and Cons

  • "It has improved our critical incident response, exposing critical issues impacting the environment and our ability to respond to those events prior to client impact as well as resolving those events more quickly. We have use cases where we have studied a 70 percent improvement for response times in an occurring event as well as future reoccurrences being improved."
  • "We can see issues that occur, sometimes before the clients do. Before we have client (or end user) calls for issues, we are able to start troubleshooting and even resolve those issues. We can quickly identify the root cause and impact of the issues as they occur, and this is very helpful for providing the best client experience."
  • "There continues to be some opportunity to expose the infrastructure from a broader reporting standpoint. Overall, the opportunity is in the reporting capability and the ability to more flexibly expose or pivot the data for deeper analysis. Oftentimes, the solution is good at looking narrowly at information, but when you want to broaden that perspective, that's where the challenges come in. At this point, it requires the export of data to external systems to do this."

What is our primary use case?

Our primary use cases are operational awareness, health of the systems, and impact on users. Other use cases include proactive performance management, system checkouts (as we investigate the ability to manage configuration and integration to the CMDB), some usage of it from a product perspective in terms of application usage, and I use it to manage and improve the user experience by understanding user behaviors.

We are in both Azure and AWS. We have both on-premise and cloud Kubernetes environments that we're running in. In fact, we have been using less efficient deployment methodologies. We haven't encountered any limitations in scaling to cloud-native environments. 

We have only used version 1.192 of the Dynatrace product. We have not used any previous versions.

How has it helped my organization?

It has improved our critical incident response, exposing critical issues impacting the environment and our ability to respond to those events prior to client impact as well as resolving those events more quickly. We have use cases where we have studied a 70 percent improvement for response times in an occurring event as well as future reoccurrences being improved.

The solution's use of a single agent for automated deployment and discovery helps our operations significantly. Oftentimes when you are looking at endpoint management, centralized monitoring teams need access to data across systems. They need to manage agents deployed throughout the organization. Remote polling of data can be helpful, but it's not deep enough, especially for APM capabilities. Having one agent significantly simplifies that functionality in such a way that it enables a very small team to manage a very large environment with very limited overhead. It provides the ability for external teams to manage it because they don't need any deeper knowledge of the application than installing the agent. They have the ability to integrate the agent into deployments and to do the work with very limited overhead.

The automated discovery and analysis helps us to proactively troubleshoot production and pinpoint the underlying root cause. We have had scenarios where we can see end user impact. One of the use cases was where we had an individual system and a cluster of nine for a content management system that was having an issue. Through Dynatrace, we were able to quickly identify the one host that was having a problem, take that out of the active cluster, recycle that application instance, bring it back, and reintroduce it to the cluster in a very efficient manner. Historically, these processes take multiple hours in order to diagnose and identify the instance, then do the work. With Dynatrace, we are able to do the work in less than 20 minutes from when it first occurred to issue resolution. Thus, there have been scenarios where you can quickly identify infrastructure issues and back-end services. 

Out-of-the-box, it's the best product that I've seen. Its ability to associate application impact, as well as root cause from an infrastructure standpoint, is by far ahead of anything that I have seen due to its ability to associate infrastructure anomalies to applications. We are still on our journey of identifying the right business KPIs to see how we can associate this data.

Dynatrace is doing an excellent job of giving us 360-degree visibility of the user experience across channels in most technologies. We are working with Dynatrace to expose the full transparency to the mainframe, as we have transactions that call from the cloud onto the mainframe and back out to other services. This is a critical visibility that isn't there yet. Otherwise, with a lot of the cloud and historical systems, we do see a lot of transparency of transaction trace across the environment.

What is most valuable?

  1. Automated discovery
  2. Automated deployments
  3. The AI

These are probably the most key, because it gets into the traceability from tracing transactions of the end user all the way through the back-end systems. We are still working through the mainframe integration, but the scenarios where we can integrate through the mainframe are very useful.

We can see issues that occur, sometimes before the clients do. Before we have client (or end user) calls for issues, we are able to start troubleshooting and even resolve those issues. We can quickly identify the root cause and impact of the issues as they occur, and this is very helpful for providing the best client experience.

We have found the self-management of the management cluster and Dynatrace processes to be highly reliable. There have been minimal issues with managing the infrastructure.

We've targeted deployment of the real-user monitoring to the most critical applications in the company to understand if there's something that's happening in the environment and the user impact. This is to be able to understand the blast radius of issues, helping us understand if an issue is impacting one app or multiple applications. We can then quickly diagnose where the common event is (the root cause), resolve it, and then leverage the product to validate healthy user traffic after completion by seeing transactions be processed again. 

From a synthetic standpoint, we use the synthetics in two ways: 

  1. We do lower-level infrastructure pings (HTTP pings), primarily to validate individual technology services on the back-end, i.e., the API endpoints.
  2. We use the front-end synthetics to validate user experience 24/7. When you have low usage periods, you are still able to validate the availability and performance of services to the organization. Oftentimes, changes may be implemented to reduce risk during lower usage times and the synthetics can be valuable to validate during that time.
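The idea behind the first kind of check, a scheduled request against an API endpoint that is judged on status and response time, can be sketched in a few lines. This is a generic illustration using only the Python standard library, not Dynatrace's implementation or API. The names `CheckResult`, `probe`, and `is_healthy` and the 2,000 ms threshold are assumptions made for the example.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    url: str
    status: Optional[int]  # None when no HTTP response was received
    elapsed_ms: float


def is_healthy(result: CheckResult, max_ms: float = 2000.0) -> bool:
    """A check passes when the endpoint answered 2xx within the time budget."""
    if result.status is None:
        return False
    return 200 <= result.status < 300 and result.elapsed_ms <= max_ms


def probe(url: str, timeout_s: float = 5.0) -> CheckResult:
    """Send one GET request and record the status code and elapsed time."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with an error status
    except OSError:
        status = None  # DNS failure, refused connection, timeout, etc.
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return CheckResult(url, status, elapsed_ms)
```

Running `probe` on a schedule against each endpoint, and alerting when `is_healthy` returns `False`, gives availability data even during low-usage periods when there is little real user traffic to observe.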

It has been very easy to deploy and obtain basic information. 

It's very good from a problem troubleshooting perspective.

What needs improvement?

I find the out-of-the-box features to be extremely valuable. However, there will be gaps and challenges as you go into a much broader set of infrastructure technologies and try to consume the necessary information from them. This will be a challenge for the company. The thing that they need to focus on is the ease of integrating external data sources, which can then also contribute to the AI. There is a ton of value out-of-the-box, but moving to the next steps will be an interesting journey. I know this is something they are focused on now. When bringing in other telemetry, whether it be network devices, databases, or other third-party products that all integrate into a larger ecosystem, there will be a lot of successes, but there will also be some challenges on this journey.

There is some complexity in the alarm processing logic within the product between the alert policies and problem notifications.

Expand the user session query data to be inclusive and enable that for the application or other telemetry within the system. Currently, in order to analyze the data outside of dashboards, it requires exporting to other reporting systems. If you want to do higher level reporting, then this may make sense. However, there is a desire to be able to do some of that analysis within the product.

There continues to be some opportunity to expose the infrastructure from a broader reporting standpoint. Overall, the opportunity is in the reporting capability and the ability to more flexibly expose or pivot the data for deeper analysis. Oftentimes, the solution is good at looking narrowly at information, but when you want to broaden that perspective, that's where the challenges come in. At this point, it requires the export of data to external systems to do this.

Adoption lagged primarily due to:

  1. The prioritization of monitoring as a functionality when teams do their work, as our teams are more focused on business functionality than nonfunctional requirements.
  2. Getting familiar with the navigation of the product. With our implementation, we have a single node where people get access to all the data within the enterprise. They're able to see everything. It takes time working through the process and getting the correct set of tags and everything else in place to allow them to filter and limit data to what they need to see and can consume. It takes some time for them to understand the data, what's there, and how to consume it as we learn how to limit the data sets to what they really want to see.

For how long have I used the solution?

About two years.

What do I think about the scalability of the solution?

At this point, we have about 1700 host units. We're monitoring 2000 to 3000 systems. We have 300 to 500 users a month using the systems with approximately 700 users overall. 

How are customer service and technical support?

Their Tier 0 is better than most companies that I have ever worked with. Normally, I'll get useful information even at that initial level/Tier 0. 

The in-app chat is extremely helpful. It helps not only with the ability for me to troubleshoot, but the ability for the rest of the organization to ask how-to questions. We have hundreds of those chats across the organization per month which are leveraged by end users.

Everything else is as expected when working through engineering and our product specialists, who have been helpful.

How was the initial setup?

The initial setup and implementation are almost too easy. With real-user monitoring and all the application monitoring, you are introducing change into the environment. It is so easy to set up, configure, and implement that you can get way ahead of your organization technically from where they are from a usability standpoint. We have run into virtually no technical limitations in implementing the product. It has purely been from the ability to get users to adapt, understand, and leverage the value of the product.

We implemented and installed the Dynatrace platform (and everything) within a couple of days. In certain environments, the product was deployed and instrumented overnight. Onboarding of teams and the required training took months. Even though we were able to technically implement the product from non-production into production within a month, with everything deployed and instrumented, it took us another eight to nine months to onboard individual teams into adopting and leveraging the product. From there, the rollout is really limited more by organizational change, communication, and facilitating training with teams and their technical capabilities. Key teams have adopted the product and used it very quickly. Therefore, we are seeing value within four weeks of deployment from our centralized critical incident teams, but the product adoption from application and development teams has lagged.

If you are implementing Dynatrace, the first thing is to not underestimate your users and their experience. Provide them personal service to onboard and consume the information, then leverage the product on the front-end. Technically, the product is so easy to implement and deploy that it is difficult to keep the rest of the organization in step when adopting it. The data starts presenting itself before users are ready and able to consume it, and you need to factor that into your implementation.

What was our ROI?

The solution has decreased both our MTTI and MTTR.

In 2018, we were having on average one issue per day. It is one of the reasons that we purchased the product in 2018. Last year, we significantly drilled those numbers down in outage time by 70 to 80 percent, as an organization. While Dynatrace is part of driving that avoidance as well as reduced outage time, it's impossible for us to have a direct correlation of its direct impact because there are so many other factors at play in an organization. I had to change management processes and everything else that could also influence that. However, we know that it was part of that increased uptime to where we've decided to invest significantly more in the product.

What's my experience with pricing, setup cost, and licensing?

It's understandable to do a smaller-scale initial evaluation. However, as you identify the product value, don't be hesitant in your scope and scale to maximize the initial investment and your opportunity to do a bulk investment in the product.

Which other solutions did I evaluate?

We have other competitive products. The automation instrument will be extremely valuable as we look to consolidate our solution set. The insight to quickly gain information is interesting and good information that we can use. There will be a challenge internally with our teams since application teams were never exposed to infrastructure information and infrastructure teams have never been exposed to application or end user information. Organizationally, we have to change where people are now going to see this insight and figure out how to leverage it for good, which will be helpful. It will be a game changer in terms of how we can identify and respond to events in the organization from the point of view of data and analysis, as opposed to tribal knowledge and fear.

Dynatrace was initially brought in to eliminate one competitive APM product. We are now on to eliminating the second, and we'll be consolidating all APM on the Dynatrace platform. We are also in the process of consolidating other infrastructure monitoring products on the platform. We expect there will be a small incremental investment from a purely licensing standpoint to consolidate the products, but we expect realization of a significant amount of benefit from the capabilities it provides from root cause analysis, impact analysis, transaction trace observability in the environment, the reduced administrative costs of disparate products, and the ability to integrate data. However, a lot of these were not measured previously because we had a lot of disparate tools across disparate teams managing things. Therefore, we can't measure the savings but we expect it will be significant.

We have CA APM Introscope, New Relic, and AppDynamics. We are users of all three of these products, though we are probably using AppDynamics the least. We have almost completely migrated away from Broadcom and are starting the replacement of New Relic.

Holistically, Dynatrace's traceability starts from the user endpoint, meaning the ability to trace a transaction from a user session all the way through other technologies. We've had more comprehensive traces than with other products. Other products do not offer an easy interface to see the trace of the user session in a comprehensive way. Dynatrace offers the ability to go from mobile, microservices, or mainframe and trace across all those platforms. It also has the ability to associate or automatically correlate user transactions to applications, then into the underlying infrastructure components. Another Dynatrace benefit is the whole function of the AI as well as bringing in other external data sources. E.g., we are looking at things like DataPower and F5 data integrations, and also incorporating those into the trace. Finally, there is support of legacy technologies. It really comes down to traceability, AI, and the support of legacy and mainframe technologies. Those are the big positive differentiators for coming to a conclusive root cause analysis.

CA APM Introscope and New Relic have simpler interfaces to consume data. With Dynatrace, you need to develop plugins to obtain easier API interfaces for pushing data into other products. This is a little easier with the other products. The New Relic Insights product is a stronger reporting feature than what Dynatrace provides.

There are also other products that we are looking at eliminating in other product suites, such as Broadcom UIM, Microsoft SCOM, and Zabbix. We have a lot of open source solutions where we're looking to roll out infrastructure, then consolidate and centralize data. The primary function and capabilities get into mobile-to-mainframe traceability in order to simplify or expedite impact and root cause analysis processes for the teams. The solution also has the ability to support our modern technologies running in AWS and Kubernetes cluster microservices as well as traceability all the way through the mainframe.

What other advice do I have?

We have integrated our notification systems through PagerDuty, Slack, and our auto ticketing app. This is to generate incident records. The integrations with PagerDuty and Slack are effective. We're in the process of migrating some tools to ServiceNow. Thus, we are in the process of doing synchronization of both the events while also evaluating the CMDB integration with ServiceNow. There are some recent capabilities that make this look more attractive to automate discovery and relationship building that we're looking forward to, but we have not yet implemented. The integration to ServiceNow will be good.

The desire is to have Dynatrace help DevOps focus on continuous delivery and shift quality issues to pre-production. We are not there yet. The vision is there and it makes sense with the information that we see, but we have not had the opportunity. Even though we've been using the product now for two years, we're only now just starting an effort to roll the product out across the enterprise and replace competitive products for application infrastructure monitoring. We'll then have the opportunity for that full CI/CD integration or NoOps opportunity.

We will be rolling out to some highly dense environments in the near future. We haven't run into any performance issues yet. The only issue that we ran into previously is with the automated instrumentation of the product. We accidentally disabled the competitive products that teams were using as we were evaluating Dynatrace. You can get in front of yourself in rollout.

We don't have the solution’s self-healing functionality integrated into the automation product. Dynatrace doesn't have the self-healing capability of restarting services. Therefore, from a monitored application perspective, we haven't enjoyed that capability yet.

We are in the process of testing some parts of the session replay. We see value there and are working through understanding the audit or compliance impacts to leverage this feature.

Based on my experience and history of the products, I would rate it at least a nine (out of 10). It's been far superior to other products in its capabilities and comprehensiveness, especially across both cloud and legacy technologies, such as older technologies (like mainframes and server-based monolithic applications).

Which deployment model are you using for this solution?

On-premises
Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Kevin McNall
Director, Digital Projects and Practices at Rack Room Shoes
Real User
Top 20
Allows our team to focus more on innovation, rather than on monitoring and bug-squashing

Pros and Cons

  • "The alerting systems are definitely the most valuable feature. The AI engine, "Davis," has proved to be a game-changer for us, as it helps to alert us when there are anomalies found in our applications or in their performance... letting the Davis engine find those anomalies and push them to the top, especially as they relate to business impact, is very valuable to us."
  • "The one area that we get value out of now, where we would love to see additional features, is the Session Replay. The ability to see how one individual uses a particular feature is great. But what we'd really like to be able to see is how a large group of people uses a particular feature. I believe Dynatrace has some things on its roadmap to add to Session Replay that would allow us those kinds of insights as well."

What is our primary use case?

We are using it to monitor our e-commerce applications and the full stack that our e-commerce applications run on. That includes both our Rack Room Shoes domain and our Off Broadway Shoes domain. We use it to monitor the overall health of the entire stack, from the hardware all the way to the user interface. And more specifically, we use it to monitor the real user's experience on the front-end.

How has it helped my organization?

What Dynatrace has really allowed our team to do is focus more on innovation, rather than on monitoring and bug-squashing. Now that we have a tool like Dynatrace, we can continue to do forward-thinking projects while Dynatrace is doing the monitoring and rooting out the root causes. We're spending a lot less time trying to find out what the problem is, versus letting Dynatrace pinpoint where the problem is. We can then validate and remediate much quicker. That's the impact it's had on our business.

The automated discovery and analysis helps us to proactively troubleshoot production and pinpoint underlying root cause. We recently had some issues with database connections. Our database team was scratching their heads, not really knowing where to look. What we were able to do with Dynatrace, because we had some of the Oracle Insights tools built into the database, was to provide, down to the SQL statement, what queries were taking up the most resources on that machine. We provided that to the database team and that gave them a head-start in being able to refactor the data so it was quicker to query. That really helped us speed up the user experience for that specific issue.

Dynatrace helps DevOps to focus on continuous delivery and to shift quality issues to pre-production. We are just now starting to use it in that way. When we first launched Dynatrace, we only had monitoring in our production environment. At that point we were using it as an up-front, first-alert tool for any issues that were happening. Now what we're doing is instrumenting our lower environments with Dynatrace so that it will allow us to monitor our load-testing in those environments, to find out where our breaking points are. So it does allow us to push out products that are much more stable and much less buggy because we're able to find out where our breaking points are in the lower environments. What this is going to do is allow us to push out, at a faster rate, more solid, less buggy releases and customer features, and allow us to continue to innovate on the next idea. We're just starting that journey. We just got fully instrumented in our lower environments in the last couple of weeks.

In terms of 360-degree visibility into the user experience across channels, we're only monitoring our digital channels right now, specifically our e-commerce channels. But we do have ways, even within the channel, to dissect by the source they came from. Did a given customer come from a digital ad? Did they come from an email? Did they come to us direct? It does allow us to segment our customers and see how each segment of customer performs as well. This is important for us because we want to make sure that we're not driving specific segments of customers into a bad-performing experience or to a slow response time. It also allows us to adequately determine where to spend our marketing dollars.

Another benefit is that it has definitely decreased our mean time to identification, with the solution and the Davis AI engine bringing the most probable root cause to the top. And within that, it gives us the ability to drill down into the specific issue or query or line of code that is the issue. So it has saved us a lot of time — I would estimate it has saved us 10 hours a week — in remediating issues and trying to find the root cause.

It has also improved uptime, indirectly. Because it gives us alerts early, we're able to mitigate issues before they're actually bigger issues.

What is most valuable?

The alerting systems are definitely the most valuable feature. The AI engine, "Davis," has proved to be a game-changer for us, as it helps to alert us when there are anomalies found in our applications or in their performance. We find that very helpful. There's still a human element to the self-healing capabilities. I wish I could say, "Oh, it's magic. You just plug it in and it fixes all your problems." I wouldn't say that, but what I would say is that the Davis engine gives us that immediate insight and allows us to cater to our solution so that the next time that problem arises it can mitigate it without a lot of human involvement.

Dynatrace's ability to assess the severity of anomalies, based on the actual impact to users and business KPIs, is really good, out-of-the-box. But it does an even better job when, again, we as humans give more instruction and provide more custom metrics that we're trying to monitor that are key to our business. And then, letting the Davis engine find those anomalies and push them to the top, especially as they relate to business impact, is very valuable to us.

We find the solution's ability to provide the root cause of our major issues, down to the line of code that might be problematic, to be valuable.

And we get a lot of value out of the Session Replay feature that allows us to capture up to 100 percent of our customers' real user experiences. That's helped us a lot in being able to find obscure bugs or make fixes to our applications. 

We also use real-user monitoring and Synthetic Monitoring functionalities. We use real-user monitoring for load times, speed index, and overall application index. And we use Synthetic Monitors to make sure that even certain outside, third-party services are available to us at all times. In certain cases, we have been reliant on a third-party service, and our Dynatrace tool has let us know that that service isn't available. We were able to remove that service from our website and reach out to the service provider to find out why it wasn't available.

We also find it to be very easy to use, even for some of our business users. Most of the folks who use the Dynatrace tool do tend to be in the technical field, but use is spread across both the business side, what we call our omni-channel group, as well as our IT group. They all use it for different purposes. I'm beginning to use it on the business side to show the impact that performance has on revenue risk. I can then go back and show that when we have bad performance it affects revenue. And I can put a dollar amount on that. So the user interface is very easy to use, even for the business user.

What needs improvement?

Dynatrace continues to innovate, and that's especially true in the last couple of years. We have continued to provide our feedback, but the one area that we get value out of now, where we would love to see additional features, is the Session Replay. The ability to see how one individual uses a particular feature is great. But what we'd really like to be able to see is how a large group of people uses a particular feature. I believe Dynatrace has some things on its roadmap to add to Session Replay that would allow us those kinds of insights as well.

For how long have I used the solution?

We started using Dynatrace in September of 2017. At that time it was an older product called AppMon. But we quickly upgraded to the current Dynatrace platform the following year. We've been using the SaaS platform ever since.

What do I think about the stability of the solution?

It's been very stable. We've had very little downtime. In the last four years there may have been one outage. Overall, it's been extremely stable. Many times, Dynatrace is our first alert that we have issues with other platforms.

What do I think about the scalability of the solution?

It's extremely scalable. We're one of the small players. We're running with about 70 agents right now. We've been at Dynatrace's conferences and have heard of customers who can deploy 5,000 agents over a weekend and have no issues at all. For our small speck-of-sand space, it's extremely scalable.

We are hosted on Google cloud. That's where all of our VMs are currently set up. Our database is there, our tax server is there. All of our application and web servers are there, and Dynatrace is monitoring all of that for us. We haven't encountered any limitations at all in scaling to our cloud-native environment. We can spin up new auxiliary servers in a matter of minutes and have Dynatrace agents running on them within 15 minutes. We're starting to play a little bit with migrating a version of our application into a Kubernetes deployment and using Dynatrace to monitor the Kubernetes containers as well.

We have plans to increase our usage of Dynatrace. We just recently updated our hosts. We needed to increase the number of host units so that we could put Dynatrace on more servers, and we've already just about used up all of those. So next year, we'll likely have to increase those host units again. And we're going to start using more pieces of Dynatrace that we haven't used before, like management zones and custom metrics.

How are customer service and technical support?

Technical support has been great. The first line of defense is their chat through the UI, which is really simple. They're super-responsive and usually get back to us within minutes. We have a solutions engineer that we can reach out to as well, and they have been very helpful, even with things like setting up training sessions and screen-sharing sessions to help enable our internal teams to be more productive using the tool.

Which solution did I use previously and why did I switch?

We were using a tool called New Relic and we were really just using it as a synthetic monitor to make sure the application was up and running, but we really weren't getting a lot of insights. When we decided that we wanted a tool that could give us more insights and that we needed a tool that could give us the ability to monitor more of our customers' behaviors, there just wasn't another tool like Dynatrace that we felt could do things as well as Dynatrace, through a "single pane of glass." We chose Dynatrace over New Relic at the time because New Relic just didn't have any solutions like it.

We haven't found another tool that can help us visualize and understand our infrastructure, and do triage, like Dynatrace. We haven't found one that can give us that full visibility into the entire stack from VM all the way to the UI. That was really the reason we picked Dynatrace. There just wasn't another tool that we felt could do it like Dynatrace.

The fact that the solution uses a single agent for automated deployment and discovery was the second reason that we chose Dynatrace. The ease of deployment, the fact that we could use the one agent and deploy it on the host and suddenly light up all of these metrics, and suddenly light up all of these dashboards with insights that we didn't have before, made it extremely attractive. It required a lot less on our part to try to do instrumentation. Now, as we add more Dynatrace agents to more of our back-end servers, we think we'll gain even more value out of it.

How was the initial setup?

We started with AppMon, which was more of an on-premise version that we installed ourselves, although it still used a single agent. Then we moved to the SaaS solution. It was very easy for us to migrate from AppMon to the SaaS solution, and it's been extremely easy to instrument new hosts with the agent.

We were up and running within 30 days when we were first engaged with AppMon. When we migrated to the SaaS solution, it maybe took another 30 days and might have even been less. I wasn't involved with that migration, but I worked closely with the guy who was. I don't remember it taking much longer than 30 days to migrate.

We had an implementation strategy. We knew specifically which application we wanted to monitor, and all of the hardware and services and APIs that that application was dependent on. We went in with a strategy to make sure that all of those things were monitored. And now we've progressed that strategy to start monitoring more of our internal back-end systems as well — the systems that support our stores, not just our e-commerce channel — to see if we can't get more value and maybe even realize more cost savings on our brick and mortar side using Dynatrace.

What was our ROI?

We have definitely seen return on our investment. It has come in the form of being able to produce more stable, less buggy applications and features, and in allowing our team to focus more on innovating new ideas that drive revenue and business, versus maintaining and troubleshooting the existing application.

It hasn't yet saved us money through consolidation of tools, but as we continue to find more value in Dynatrace, it does make us look at other tools and see if we are able to use Dynatrace to consolidate them. We have replaced other application monitoring tools with Dynatrace, but we've not yet consolidated tools.

What's my experience with pricing, setup cost, and licensing?

Whatever your budget is, you can manage Dynatrace and get value out of it, but you need to manage it to what your needs are. That's the one thing we found. We did not budget the right amount to begin with. It has cost us more in the long run than if we had been able to negotiate it upfront. But we didn't really know what we didn't know until we'd been using Dynatrace for a while.

Your ability to capture Session Replay is based on the number of what they call DEM units, digital experience monitoring units. That's where we were short to begin with. You have to budget not just for the platform subscription but also for the number of host units that you want to run and the number of DEM units that you need to capture all of the user experiences that you want. In our case, we wanted the ability to capture 100 percent. Maybe in another business someone would only be worried about capturing a sampling of the traffic.

Which other solutions did I evaluate?

We evaluated New Relic, AppDynamics, AppMon, which was the Dynatrace solution at the time, and we also looked at Rigor.

Dynatrace could do pretty much everything. It wasn't just the real-user monitoring piece of it. It was also the full stack health aspect. The Davis AI engine was probably the biggest differentiator among all of the tools. The Davis AI engine and its ability to surface the root cause was a game-changer.

What other advice do I have?

My advice would be to jump all-in. There doesn't seem to be another tool that can do it like Dynatrace, and from what we've seen the last two times we've gone to their Dynatrace Perform conferences, they are dedicated to innovating and adding features to the platform.

We are not yet using Dynatrace for dynamic microservices within a Kubernetes environment. We are beginning to play in that arena. We're looking at tools that will help us migrate from our current VM architecture to a Kubernetes deployment architecture, to enable us to get more into a no-DevOps type of environment. But today, we're still on a virtual machine deployment architecture.

Similarly, we have not integrated the solution with our CI/CD and/or ITSM tools. That is on our roadmap. As we migrate and transition into a no-DevOps and continuous integration/continuous deployment operation, we'll begin to use Dynatrace as part of our deployment processes.

The solution hasn't yet decreased our time to market for new innovations or capabilities, but we believe that we will realize that benefit going forward, since we'll be leveraging Dynatrace in our lower environments to find out where breaking points are of new features that we release.

We have half-a-dozen regular users who range from our e-commerce architect to DevOps engineers to front-end software developers. My role as a user is more of a senior-level executive or sponsor role. We also have some IT folks, some database administrators and some CI people, but most of our users are in the IT/technical realm.

We don't have a team dedicated to maintaining the solution. We do have a team responsible for it, though. That is the team that just helped instrument our lower environment with Dynatrace. We've got some shared responsibilities and some deployment instructions that are shared across three different groups: IT; our omnichannel group, which is really our business side; and a third party that we leverage for staff augmentation and that uses Dynatrace to help us monitor during our off-hours.

Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Richard Mitchell
DevOps Leader at a legal firm with 501-1,000 employees
Real User
Good executive-level dashboards with powerful automation and AI capabilities, but the management interface could be more intuitive

Pros and Cons

    • "The user interface for the management functions is not particularly intuitive for even the most common features."

    What is our primary use case?

    Our primary use case is the consolidation of observability platforms.

    How has it helped my organization?

    Looking at Dynatrace's automation and AI capabilities, automation is generally a great place to start. In products where there has been no observability or a very limited amount, the automation can give a great deal of insight, telling people things that they didn't know that they needed to know.

    Davis will do its best to provide root cause analysis, but you, as a human, are still responsible for joining as many of the dots together as possible in order to provide as big a picture as possible. As long as you accept that you still have to do some work, you'll get a good result.

    I have not used Dynatrace for dynamic microservices within a Kubernetes environment in this company, but I have had an AWS microservice cluster in the past. Its ability to cope with ephemeral instances, as Kubernetes workloads usually are, was very good. The fact that we didn't have to manually scale out to match any autoscaling rules on the Kubernetes clusters was very useful. Its representation of them at the time wasn't the best. Other products, Datadog, for example, had a better representation in the actual portal of the SaaS platform. That was about three years ago, and Dynatrace has changed, but I haven't yet reused the Kubernetes monitoring to see if it has improved in that regard.

    Given that Dynatrace is a single platform, as opposed to needing multiple tools, the ease of management is good because there is only one place to go in order to manage things. You deal with all of the management in one place.

    The unified platform has allowed our teams to better collaborate. In particular, because of the platform consolidation, using Dynatrace has made the way we work generally more efficient. We don't have to hop between seven different monitoring tools. Instead, there's just one place to go. It's increased the level of observability throughout the business, where we now have development looking at their own metrics through APM, rather than waiting until there's a problem or an issue and then getting a bug report and then trying to recreate it.

    It's increased visibility for the executive and the senior management, where they're getting to see dashboards about what's happening right now across the business or across their products, which didn't used to exist. There's the rate at which we can monitor new infrastructure, or applications, or custom devices. We had a rollout this week, which started two days ago, and by yesterday afternoon, I was able to provide dashboards giving feedback on the very infrastructure and applications that they had set the monitoring up on the day before.

    As we've only been using Dynatrace in production for the past month in this company, the estimate as to the measurement of impact isn't ready yet. We need more time, more data, and more real use cases as opposed to the synthetic outages we've been creating. In my experience, Dynatrace is generally quite accurate for assessing the level of severity. Even in scenarios where you simply rely on the automation without any custom thresholds or anything like that, it does a good job of providing business awareness as to what is happening in your product.

    Dynatrace has a single agent that we need to install for automated deployment and discovery. It uses up to four processes and we found it especially useful in dealing with things like old Linux distros. For example, Gentoo Linux couldn't handle TLS 1.2 for transport and thus, could not download the agent directly. We only had to move the one agent over SSH to the Gentoo server and install it, which was much easier than if we'd had to repeat that two or three times.

    The automated discovery and analysis features have helped us to proactively troubleshoot products and pinpoint the underlying root cause. There was one particular product that benefited during the proof of concept period, where a product owner convened a war room and it took about nine hours of group time to try and reason out what might be the problem by looking at the codebase and other components. Then, when we did the same exercise for a different issue but with Dynatrace and the war room was convened, we had a likely root cause to work from in about 30 minutes.

    In previous companies where the deployment has been more mature, it was definitely allowing DevOps to concentrate on shipping quality rather than where I am now, which is deploying Dynatrace. The biggest change in that organization was the use of APM and the insights it gave developers.

    Concurrent with the deployment of Dynatrace, we adopted a different methodology using Scrum and Agile for development. By following the Scrum pattern of meetings, we were able to observe the estimated time in the planning sessions for various tasks. It started to come down once the output of the APM had been considered. Ultimately, Dynatrace APM provided the insight that allowed the developers to complete the task faster.

    What is most valuable?

    The most valuable features for us right now are the auto-instrumentation, the automatic threshold creation, and the Davis AI-based root cause analysis, along with the dashboarding for executives and product owners.

    These features are important because of the improved time it takes for deployment. There is a relatively small team deploying to a relatively large number of products, and therefore infrastructure types and technology stacks. If I had to manually instrument this, like how it is accomplished using Nagios or Zabbix, for example, it would take an extremely long time, perhaps years, to complete on my own. But with Dynatrace, I can install the agent, and as long as there is a properly formed connection between the agent and the SaaS platform, then I know that there is something to begin working with immediately and I can move on to the next and then review it so that the time to deployment is much shorter. It can be completed in months or less.

    We employ real user monitoring, session replay, and synthetic monitoring functionalities. We have quite a few web applications and they generally have little to no observability beyond the infrastructure on which the applications run. The real user monitoring has been quite valuable in demonstrating to product owners and managers how the round-trips, or the key user actions, or expensive queries, for example, have been impacting the user experience.

    By combining that with session replay and actually watching through a problematic session for a user, they get to experience the context as well as the raw data. For a developer, for example, it's helpful that you can tell them that a particular action is slow, or it has a low Apdex score, for example, but if you can show them what the customer is experiencing and they can see state changes in the page coupled with the slowness, then that gives a much richer diagnostic experience.
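
The Apdex score mentioned above is a standard index: with a target threshold T, responses at or below T count as satisfied, responses between T and 4T count as tolerating (half weight), and anything slower counts as frustrated. The following is a minimal sketch of that formula only. The 0.5-second threshold and the sample timings are hypothetical, and it makes no claim about how Dynatrace computes or configures the score internally.

```python
def apdex(response_times_s, threshold_s=0.5):
    """Apdex = (satisfied + tolerating / 2) / total samples.

    threshold_s is the target time T; 0.5 s is an assumed value.
    """
    if not response_times_s:
        raise ValueError("need at least one sample")
    # Satisfied: response time at or below T.
    satisfied = sum(1 for t in response_times_s if t <= threshold_s)
    # Tolerating: above T but no more than 4T.
    tolerating = sum(
        1 for t in response_times_s if threshold_s < t <= 4 * threshold_s
    )
    # Frustrated samples (above 4T) contribute nothing to the numerator.
    return (satisfied + tolerating / 2) / len(response_times_s)


# Hypothetical timings, in seconds, for one key user action.
samples = [0.2, 0.4, 0.9, 1.8, 3.0]
print(f"Apdex: {apdex(samples):.2f}")
```

A score like this tells a developer that an action is slow; the session replay described above supplies the context for why.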

    We use the synthetics in conjunction either with the real user monitoring or as standalone events for sites that either aren't public-facing, such as internal administration sites, or for APIs where we want to measure things in a timely manner. Rather than waiting for seasonal activity from a user as they go to work, go home, et cetera, we want it at a constant rate. Synthetics are very useful for that.

    The benefit of Dynatrace's visualization capabilities has been more apparent for those that haven't used Dynatrace before or not for very long. When I show a product owner a dashboard highlighting the infrastructure health and any problems, or the general state of the infrastructure with Data Explorer graphs on it, that's normally a very exciting moment for them because they're getting to see things that they could only imagine before.

    In terms of triaging, it has been useful for the sysadmins and the platform engineering team, as they normally had to rely on multiple tools up until now. We have had a consolidation of observability tools, originally starting with seven different monitoring platforms. It was very difficult for our sysadmins as they watched a data center running VMware with so many tools. Consolidating that into Dynatrace has been the biggest help, especially with Davis backing you up with RCAs.

    The Smartscape topology has also been useful, although it is more for systems administrators than for product owners. Sysadmins have reveled in being able to see the interconnectedness of various infrastructures, even in the way that Dynatrace can discover things to which it isn't directly instrumented. When you have an agent on a server surrounded by other servers, but they do not have an agent installed, it will still allow a degree of discovery which can be represented in the Smartscape topology and help you plan where you need to move next or just highlight things that you hadn't even realized were connected.

    What needs improvement?

    The user interface for the management functions is not particularly intuitive for even the most common features. For example, you can't share dashboards en masse. You have to open each dashboard, go into settings, change the sharing options, go back to dashboards, et cetera. It's quite laborious. Datadog, which is also a single platform, does a better job of making these options accessible in the same scenario.

    User and group management in the account settings for user permissions could be improved.

    The way that Dynatrace deals with time zones across multiple geographies is quite a bone of contention because Dynatrace only displays the browser's local time. This is a problem because when I'm talking with people in Canada, which I do every day, they either have to run, on the fly, time recalculations in their heads to work out the time zone we're actually talking about as relevant to them, or I have to spin up a VM in order to open the browser with the time zone set to their local one in order to make it obvious to them without them having to do any mental arithmetic.
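
The mental arithmetic described above can be done outside the tool. This is a small sketch using Python's standard `zoneinfo` module to render one instant in several time zones. The timestamp and the choice of London, Toronto, and Vancouver are assumptions for illustration, and it does not touch any Dynatrace setting or API.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def in_zones(moment, zone_names):
    """Return the same instant rendered in each named IANA time zone."""
    return {name: moment.astimezone(ZoneInfo(name)) for name in zone_names}


# Hypothetical problem start time, as displayed in a browser set to UK time.
seen_locally = datetime(2021, 11, 15, 14, 30, tzinfo=ZoneInfo("Europe/London"))

for name, local in in_zones(
    seen_locally, ["America/Toronto", "America/Vancouver"]
).items():
    print(f"{name}: {local:%Y-%m-%d %H:%M %Z}")
```

Because the input datetime is timezone-aware, daylight saving changes on either side are handled by the time zone database rather than by hand.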

    For how long have I used the solution?

    Personally, I have been using Dynatrace since November of 2018. At the company I am at, we have been using it for approximately four months. It was used as a PoC for the first three months, and it has been in production for the past month.

    What do I think about the stability of the solution?

    The SaaS product hasn't had any downtime while I've been at my current company. I've experienced downtime in the past, but it's minimal.

    What do I think about the scalability of the solution?

    To this point, I've not had any problems with the scalability, aside from ensuring that you have provisioned enough units. However, that is another point that is related to pricing.

    Essentially, its ability to scale and continue to work is fine. On the other hand, its ability to predict the required scalability in order to purchase the correct number of various units is much harder.

    How are customer service and support?

    Talking about Dynatrace as a company, the people I've spoken to have always been responsive. The support is always available, partly because of our support package. As a whole, Dynatrace has always been a very responsive entity, whether I've been dealing with them in North America or in the UK.

    Which solution did I use previously and why did I switch?

    We have used several other solutions including Grafana, Prometheus, Nagios, Zabbix, New Relic, AWS CloudWatch, Azure App Insights, and AppDynamics. We switched to Dynatrace in order to consolidate all of our observability platforms.

    Aside from differences that I discuss in response to other questions, other differences would come from the product support rather than the product itself. Examples of this are Dynatrace University, the DT One support team, the post-sales goal-setting sessions, and training.

    We're yet to have our main body of training, but we're currently scheduled to train on about 35 modules. Last year, when I rolled out Datadog, the training wasn't handled in the same way; it was far more on request for specific features. This, by contrast, is an actual curriculum to familiarize end users with the product.

    How was the initial setup?

    In my experience, the initial setup has been straightforward, but I've done it a few times. When I compare it to tools like Nagios, Zabbix, Grafana, and Prometheus, it is very straightforward. This is largely for two reasons.

    First, they're not SaaS applications, whereas Dynatrace is, and second, the amount of backend configuration you have to do in preparation for those tools is much higher. That said, if we were to switch to Dynatrace Managed rather than Dynatrace SaaS, I imagine that the level of complexity for Dynatrace would rise significantly. As such, my answer is biased towards Dynatrace SaaS.

    What was our ROI?

    In my previous company, it allowed a very small team to manage what was a very fast-moving tech stack. In my current company, it is still very early.

    The consolidation of tools due to implementing Dynatrace has saved us money, although it's tricky to measure the impact. The list price of Dynatrace was more than the previous list price spend on monitoring tools because the various platforms had been provided as open-source tools, were provided through hosting companies, or had been acquired as part of acquisitions of other companies.

    The open-source applications that we used included Grafana, Prometheus, Nagios, and Zabbix. New Relic, as an example, was provided through a hosting company, Carbon60 in Canada. AppDynamics, as another example, came with the acquisition of a Canadian company and sat in that company's budget rather than our own.

    The hope was that Dynatrace, through consolidation, would remove the material cost of the administrative overhead of tools like Prometheus and Grafana and the cost of hosting infrastructure for solutions like Nagios, Zabbix, Prometheus, Grafana, et cetera. This means that it is more of an upstream cost-saving, where we would be saving human effort and hosting costs by consolidating into a SaaS platform, which is pretty much all-in-one.

    What's my experience with pricing, setup cost, and licensing?

    Dynatrace's pricing for their consumption units is rather arcane compared to some of the other tools, thus making forward-looking calculations based on capacity planning quite hard. This is because you have to do your capacity planning, work out what that would mean in real terms, then translate that into Dynatrace terms and try to ensure you have enough Davis units, synthetics units, DEM units, and host units.

    Catching those and making sure you've got them all right for anything up to a year in advance is quite hard. This means that its ability to scale and continue to work is fine but predicting the correct number of various units to purchase is much harder.
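
The translation step described above, from a capacity forecast into unit counts, can be sketched as a small calculation. Every rate below is a placeholder chosen for illustration and is not Dynatrace's actual consumption rate; the forecast and purchased figures are likewise hypothetical. Davis units are left out only to keep the sketch short.

```python
import math

# unit name -> (forecast driver, units consumed per driver item).
# PLACEHOLDER rates for illustration only, not Dynatrace's real pricing.
HYPOTHETICAL_RATES = {
    "host_units": ("hosts", 1.0),
    "dem_units": ("user_sessions", 0.25),
    "synthetic_units": ("synthetic_runs", 1.0),
}


def units_needed(forecast, rates=HYPOTHETICAL_RATES):
    """Translate a capacity forecast into whole units per unit type."""
    return {
        unit: math.ceil(forecast[driver] * rate)
        for unit, (driver, rate) in rates.items()
    }


def shortfall(needed, purchased):
    """Return only the unit types where the forecast exceeds what was bought."""
    return {
        unit: count - purchased.get(unit, 0)
        for unit, count in needed.items()
        if count > purchased.get(unit, 0)
    }


# Hypothetical one-year forecast and current purchase.
forecast = {"hosts": 120, "user_sessions": 400_000, "synthetic_runs": 30_000}
purchased = {"host_units": 100, "dem_units": 120_000, "synthetic_units": 30_000}

needed = units_needed(forecast)
print("needed:", needed)
print("shortfall:", shortfall(needed, purchased))
```

Keeping the rates in one table makes it easy to rerun the forecast when the vendor's actual rates or your traffic assumptions change.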

    The premium support package is available for an additional charge.

    What other advice do I have?

    At this point, we have not yet integrated Dynatrace with our CI/CD tool, which is Azure DevOps. However, in the future, our plan is to provide post-release measurements and automated rollbacks when necessary. Even further down the road, there's ServiceNow on the roadmap, which we're currently bringing in from an Australian acquisition in order to try and promote the ITSM side of the business.

    There is nothing specific that has been implemented so far, although there have been general degrees of automation. When we get Agile, DevOps, and ServiceNow in place, the degree of automation will increase dramatically. For example, automated rollbacks in the case of deployment failure or change management automation through the current state of the target system are being included in the ServiceNow automation.

    The automation that has been done to alleviate the effort spent on manual tasks is still very light because I'm the only person doing the work. I generally don't have time to do the ancillary tasks at the moment, such as creating automations. It's mostly a case of deploying instruments, observing, and moving on. When we come back to revisit it, then we'll look at the automations.

    My advice for anybody who is looking into implementing Dynatrace is to make sure you talk constantly with your Dynatrace representatives during the PoC or trial phase, because there is invariably far more that Dynatrace can do than you realize. We only know what we know. I'm not suggesting that you let Dynatrace drive, but rather that you have them constantly provide best practices. You will achieve faster returns afterward, whether that's labor savings, or recovery time, or costs from downtime. Basically, you want to make sure that you leverage the expertise of the company.

    In summary, this is a very good product but they need to sort out their user interface issues and provide a more logical experience.

    I would rate this solution a seven out of ten.

    Which deployment model are you using for this solution?

    Public Cloud
    Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
    Donald Hall
    Manager, Ecommerce Support at a retailer with 1,001-5,000 employees
    Real User
    Top 20
    The ability to capture every single user session on the site and work with our customer support team has been a huge return on investment

    Pros and Cons

    • "My primary use of the tool is to keep revenue coming into the business and to use it to help our business team in running their site analytics and web performance tools. They have things like Adobe Analytics that provide them with one layer of data. We use Dynatrace as another railroad metric to both confirm the Adobe Analytics data and enhance it in certain places where Adobe won't give us the answers that we need. In terms of metrics, we've had roughly about 120,000 unique sessions per hour on our website. So, we're capturing a lot of session data and real user data, and all of that data is kept in user sessions. We can look this information up by user ID to tag any given session that we want to find by date/client. E.g., if the user said that they had an issue last Thursday at 11:00 PM, then we can just do a search on their email address, go through all their sessions, and find the one that they mentioned, then dig directly into that one."
    • "Some of the analytics that you get in, e.g., a waterfall analysis of a web page could be clearer. A lot of that is not directly attributable to Dynatrace. Sometimes a vendor will implement a tag or JavaScript plugin that's named something entirely different than what it does. This makes it difficult to track that from the waterfall list, figure out where exactly that component is, and dig more into what it's doing. Dynatrace could probably improve a bit on that waterfall layout to make it clearer as to what exactly is there. It does a wonderful job of telling you what loads and when, but it could be improved in terms of telling me what exactly it is loading."

    What is our primary use case?

    My use cases are typically working in conjunction with our business partners. For the most part, I get questions from the business as to what our conversion rates are on the eCommerce website or what the typical user journey looks like, especially when they're doing A/B testing. They like to double-verify that in Dynatrace with user session tracking to see which users are taking which path.

    I often get diagnostic questions about things like latency. Somebody on the business side will perceive some latency on one of our pages, then give me a call to use Dynatrace to go in and do a waterfall analysis of the page load to see if it is in fact loading more slowly than it has been in the past.

    How has it helped my organization?

    Since we are a 24/7 support shop, our primary job is to make sure that revenue keeps going in through the site and the site itself works. We have categorized issues into three tiers: Priority 1, 2, and 3. Depending on where a given issue falls, the Dynatrace alert is generated and sent to my team, then prioritized into one of those three categories. The response then to that issue will fall into a bucket for that type of priority. 

    My primary use of the tool is to keep revenue coming into the business and to use it to help our business team in running their site analytics and web performance tools. They have things like Adobe Analytics that provide them with one layer of data. We use Dynatrace as another railroad metric to both confirm the Adobe Analytics data and enhance it in certain places where Adobe won't give us the answers that we need. In terms of metrics, we've had roughly about 120,000 unique sessions per hour on our website. So, we're capturing a lot of session data and real user data, and all of that data is kept in user sessions. We can look this information up by user ID to tag any given session that we want to find by date/client. E.g., if the user said that they had an issue last Thursday at 11:00 PM, then we can just do a search on their email address, go through all their sessions, and find the one that they mentioned, then dig directly into that one.
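
    The lookup described above (find one user's session near a reported time) can be sketched as follows. This is only an illustration of the idea: the `Session` record, the `find_sessions` helper, and the 30-minute padding are hypothetical and are not the Dynatrace data model or query interface.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Session:
    # Hypothetical record shape, used only for this illustration.
    user_id: str
    start: datetime
    end: datetime


def find_sessions(sessions, user_id, reported_at, slack=timedelta(minutes=30)):
    """Return the user's sessions that overlap the reported time, padded by `slack`."""
    window_start = reported_at - slack
    window_end = reported_at + slack
    return [
        s for s in sessions
        if s.user_id == user_id and s.start <= window_end and s.end >= window_start
    ]


sessions = [
    Session("pat@example.com", datetime(2020, 6, 4, 22, 50), datetime(2020, 6, 4, 23, 20)),
    Session("pat@example.com", datetime(2020, 6, 5, 9, 0), datetime(2020, 6, 5, 9, 15)),
    Session("sam@example.com", datetime(2020, 6, 4, 23, 0), datetime(2020, 6, 4, 23, 5)),
]

# A user reports a problem "last Thursday at 11:00 PM".
matches = find_sessions(sessions, "pat@example.com", datetime(2020, 6, 4, 23, 0))
```

    The padding around the reported time is a design choice: users rarely remember the exact minute, so a window is more forgiving than an exact match.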

    It has completely transformed the way that we do eCommerce from a couple of different perspectives. The first one is that we are really tied in with our business users to a much greater extent than we were in the past. A lot of the Dynatrace data is data which not just the technical team wants to see and to work with, but it is data that the business team wants to see as well. It has sort of facilitated a better communication stream between my business partners and my team to allow us to go back and forth on different aspects of the website. We didn't have that in the past. 

    It has also built more solid relationships between the application development team and my support team because they will often have questions after new releases as to how those new releases affect different areas of the site. So, we have constant sessions with those guys in standing meetings on a biweekly basis to go through different aspects of what a release has done to the website, how we can use Dynatrace to track differences from one release to the next, and see if there is any latency as a result.

    What is most valuable?

    There are about 24 dashboards built, not only for my eCommerce support team, but I've also built dashboards for our development, business, analytics, and senior management teams that allow them to log into the product. Without having to know much about Dynatrace, they can just click widgets on the dashboard that I've customized for them to get to the information that they need. This saves me a lot of time because I don't have to go in and investigate every single issue that crops up. I can go in and just ensure that there's a dashboard available for diagnosing that type of issue and point the user who is interested in it to that dashboard.

    After the dashboards, the Davis AI engine is fantastic. We're able to set thresholds within the Dynatrace application for what acceptable load times are for our web pages or API callback times. Davis actually monitors what those thresholds are and notifies me, not only when the thresholds are violated, but when they are either over or under the threshold. Then, it makes suggestions for tuning those thresholds based on what it sees in real user actions. Therefore, I'm never dealing with outdated data. Every day Dynatrace is updating what the user experience looks like and letting me have that compared to my benchmarks, which is super useful.
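
    To make the threshold idea concrete, here is a minimal sketch of comparing a configured load-time threshold with what real users actually experience. It is not how Davis works internally; the nearest-rank percentile, the 90th-percentile choice, the 20 percent tolerance, and every name below are assumptions made for illustration.

```python
def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct is an integer 1..100)."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling division
    return ordered[rank - 1]


def review_threshold(load_times_s, threshold_s, pct=90, tolerance=0.2):
    """Classify a threshold against observed load times (hypothetical rule)."""
    observed = percentile(load_times_s, pct)
    if observed > threshold_s:
        status = "violated"
    elif observed < threshold_s * (1 - tolerance):
        status = "loose"  # threshold is far above what users actually see
    else:
        status = "ok"
    return {"observed": observed, "status": status}


load_times = [1.0, 1.2, 1.1, 1.3, 1.4, 1.2, 1.1, 1.5, 1.3, 1.2]
result = review_threshold(load_times, threshold_s=3.0)
```

    A "loose" result is the case where tuning the threshold down would be worth considering, because real user timings sit well below it.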

    It's super easy to manage, especially the SaaS solution. We deploy the JavaScript through Tealium, which is one of the tools that we use to deploy tags to our website. Whenever we get a new version, or if we/Dynatrace create it off the Dynatrace JS, then we just deploy it through Tealium and that goes out to every page on our website automatically. Our users' browsers then start getting that new payload dropped into their browsers the next time they visit the website.

    We use both Session Replay and synthetic monitors:

    • A Session Replay is primarily for our customer support operation. We sometimes have customers call in to complain of issues with the website, and it's really useful to be able to look a customer session up and replay it in its entirety so you can see exactly what happened during that user journey. You can even find the point that the user is calling in to raise an issue, so you can dig down and resolve it. We record every single user session on the site. We don't have a limit on the number of sessions we capture because it's just so useful for our customer service folks to be able to do this. It's worth the trade off for us.
    • Synthetic monitoring is set up on a couple of different levels. My primary synthetic runs every five minutes, every day, 365 days of the year. It is a simple, single page pane that tells me whether the website is up or down. If it is down, then an email goes out to a distribution list and a pager text goes out to whoever is on call at that time to let them know that the site is down. We also use synthetics for user journey testing. When new features go in from development, as part of our QA process, we'll often set up a Dynatrace synthetic that simulates what that user journey should look like. We will then allow the synthetic to run every given set of minutes (whether it's 10 minutes, 15 minutes, or half an hour) to collect data on what that user journey looks like. This allows us to go in and run our reports against that synthetic module rather than against real user search.
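
    The up/down synthetic described above boils down to a scheduled availability check plus a notification. A rough sketch of that pattern, using only the Python standard library, might look like this. It is not a Dynatrace synthetic monitor; `http_status`, `check_site`, and the notification hook are hypothetical names, and treating any 2xx or 3xx response as "up" is an assumption.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Fetch `url` and return the HTTP status code, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except OSError:
        return None  # DNS failure, refused connection, timeout, etc.


def check_site(url, fetch=http_status, notify=print):
    """Return True if the site looks up; otherwise notify and return False."""
    status = fetch(url)
    up = status is not None and 200 <= status < 400
    if not up:
        notify(f"SITE DOWN: {url} (status={status})")
    return up
```

    A scheduler such as cron would call `check_site` every five minutes, and `notify` would be swapped for whatever sends the email and pager text. Passing `fetch` and `notify` as parameters keeps the check testable without touching the network.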

    What needs improvement?

    Some of the analytics that you get in, e.g., a waterfall analysis of a web page could be clearer. A lot of that is not directly attributable to Dynatrace. Sometimes a vendor will implement a tag or JavaScript plugin that's named something entirely different than what it does. This makes it difficult to track that from the waterfall list, figure out where exactly that component is, and dig more into what it's doing. Dynatrace could probably improve a bit on that waterfall layout to make it clearer as to what exactly is there. It does a wonderful job of telling you what loads and when, but it could be improved in terms of telling me what exactly it is loading.

    For how long have I used the solution?

    We have been using it for about 18 months. We first installed it last year in January. 

    What do I think about the stability of the solution?

    It is remarkably stable. They push out new releases of the SaaS application every two to three weeks. Dynatrace has a super active development schedule on their side. When they push out the release to us, all we have to do is push our JavaScript component back out to the website to allow it to go out to our users' browsers. Dynatrace has never gone down a single time in the 18 months that we've been using it. It's never been down when we needed it, and it's always collected the data that we need to analyze when we go into it.

    Dynatrace itself doesn't really break, but the website does on occasion lose connectivity with a given API vendor, whether it's a payment processor or one of our other API pieces. Dynatrace is very good at alerting us when that happens, but we're not using any sort of self-healing capacity for that. What we do is we get alerted when those APIs aren't available or when a web page has an issue. My team gets alerts on a 24/7 pager basis, then we go in and investigate to resolve them.

    The solution has given me a better view into uptime, in terms of how much downtime our website has. However, the Salesforce Commerce Cloud solution is remarkably solid. We rarely have downtime. Even during the holiday season for Black Friday and Cyber Monday, we've had zero percent downtime for the last two years. From one perspective, it does let us know when the site is down using the synthetic monitor. The good news is that it has not had to give us much data because the site just doesn't go down.

    What do I think about the scalability of the solution?

    We have 96 users logging into Dynatrace right now.

    Our primary eCommerce environment is Salesforce Commerce Cloud. That is where we have the agentless SaaS solution implemented. We also have internal API servers within Azure, where we've implemented user agents to track them. We have not encountered any limitations in scaling to cloud-native environments with Salesforce Commerce Cloud. We've had remarkably smooth deployments.

    We have not scaled up to any other environments at this point.

    How are customer service and technical support?

    It is fantastic. You don't even have to pick up the phone. You can submit a Jira ticket from directly within Dynatrace to the support team. Those Jira tickets are categorized based on the component of Dynatrace that you're looking at. You actually get a live agent chat within the tool, so you're not only submitting a ticket and getting a case number, but you have the support rep right there in the chat session to walk you through it. It's the only product I use that has this kind of interactive online help system.

    Which solution did I use previously and why did I switch?

    We had AppMon, which is the previous version of their tool, before upgrading to Dynatrace. The first thing that we did was upgrade our on-prem AppMon solution to a solution with a Dynatrace agent set up in our DMZ on the network, then we added user agents on each of our API servers. This has morphed, as of last October, into a SaaS agentless solution that we run through a JavaScript snippet on our website. Every page on our website has a bit of JavaScript with a tiny JavaScript module that deploys out to the browser. For every user who visits our website, Dynatrace then collects metrics on what those users' actions are during the session and gives us reporting tools so we can check performance on the website.

    We have used other monitoring applications, like SolarWinds and Gomez, in the past. However, they have all been replaced by Dynatrace at this point.

    How was the initial setup?

    From the infrastructure perspective, what we have installed is the user agent on our API servers. So, we have six API servers set up in an Azure load balancing pool. There are three active at any given time. It was super neat when we installed the agent, because we actually went out and had lunch after the installation. When we came back, Dynatrace had generated the Smartscape view of not just the API and the different services they connect to, but it had crawled our entire network and found everything that it recognized from SQL Server databases to .NET Servers and API services. All of that stuff showed up in a sidenav automatically without our having hands on anything whatsoever. It provides a good quick view in the morning when you come in and just flip to that view right away, because it will flash in "red" for any given service or platform that is having a problem, then you can zoom to that problem and look into it right away.

    What was our ROI?

    Being able to capture every single user session on the site and work with our customer support team has been a huge return on investment for me. Of course, the additional support on top of Adobe Analytics to be able to track things, like website conversion, is also a huge return on investment.

    The time to diagnose has decreased on average by 34 percent in its implementation. That is primarily because Dynatrace not only alerts you when something goes wrong, but the Davis AI also gives you suggestions as to why it may have gone wrong. This gives you a head start on triage and resolution.

    Because the Davis AI gives us such a head start on problem identification, this leads into triage and diagnosis. Our diagnosis time has gone down significantly. We can find, identify, and get problems triaged more quickly than we could in the past by approximately 25 percent.

    What's my experience with pricing, setup cost, and licensing?

    The only limitation with scaling to cloud-native environments is licensing. It all depends on how many DEM units you're willing to license. The more DEM units you purchase, the more user data you can collect.

    Which other solutions did I evaluate?

    We did compare it with several other products in the market when we did our due diligence before purchasing an APM and SaaS solution. Dynatrace came out just leaps and bounds beyond the pack. We're very happy with the results we're getting with it today.

    We compared Dynatrace with AppDynamics, Opsgenie, and New Relic.

    What other advice do I have?

    We do not use the solution for dynamic microservices within a Kubernetes environment. It was on our development roadmap for this year, but I think COVID-19 has probably pushed it to next year. While it is something we will be doing, we're not doing it now.

    We have not yet integrated the solution with our CI/CD and/or ITSM tools, as it was on our roadmap for this year. We are a GitHub and Jenkins shop, and Dynatrace has plugins for both of those tools. One of the very next things we want to do with the tool is plug it into our CI/CD process so we can have sort of a hands-free build. We want to allow our builds to run through the entire pipeline and be managed by these three tools, then allow Dynatrace to do the reporting on the deployment and the resulting difference in the web application based on that new format.

    My advice would probably be to start with the SaaS implementation to get a feel for Dynatrace, what it does, and what it can deliver. Then, based on results with the SaaS platform, evaluate installing the onsite on-prem solution. They both have their advantages and disadvantages. They obviously work best when you use them together, but there are some instances where our firm does not need an on-prem solution and may need just the SaaS application. Vice versa, there may be some firms that just need the on-prem solution and don't need the SaaS cloud based solution. In my opinion, it is best to start with SaaS, then based on what you discover with SaaS, decide whether you need on-prem.

    I would give Dynatrace a solid eight (out of 10). It's beyond the expectations that I had when we purchased and installed it. As I went along and learned more about Dynatrace after the implementation, I was impressed with how much the tool does. Another aspect is not just how much it does, but how easy it is to do it. The AI engine runs 24/7/365, providing input. The dashboards make it super easy for my users to use as well as myself. 

    The analytics that it provides are very easy to read. You can present them in pie charts, bar charts, or single table data. There's just a myriad of ways you can display the data that you get from Dynatrace to make it more consumable for users. 

    Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
    ReinhardWeber
    Senior Product Manager at SAP CX
    Real User
    Top 20
    System updates, bug fixes, or upgrades to the whole cluster have almost zero maintenance

    Pros and Cons

    • "Service engineers save a lot of time because they can just go in, look at the data, and share it with the customer, who has the same view, and say, "Here's an improvement which can be immediately implemented." It's not like a collection of big, multiple findings that are consolidated into one results presentation, then the customer needs to do something. It's more like a continuous performance analysis and improvement process, which is more efficient than those workshop approaches. That's one of the biggest advantages that our services team sees because it helps DevOps to focus on continuous delivery and shift quality issues to pre-production."
    • "Documentation could be improved. E.g., you don't know how to properly use Dynatrace because documentation is lagging behind the features being deployed."

    What is our primary use case?

    It's used in two major use cases:

    1. Monitoring our own internal IT operations. 
    2. We provide our customers access to Dynatrace tenants so they can also use it when developing their code running on our platform.

    It does full stack monitoring for internal operations, problem diagnostics, APM use cases, and performance management for our customers.

    We have multiple instances of Dynatrace running, where about half of them are running in our data centers and the other half are running in the public cloud. Therefore, it's a hybrid deployment. We use a mixture of cloud providers, including AWS, Microsoft Azure (running Kubernetes), and Google Cloud Platform.

    We have traditional deployments on VMware virtual machines as well as running stuff in the cloud. We have a couple hundred Kubernetes clusters monitored using Dynatrace. Dynatrace's functionality in this area is unmatched combined with its full stack visibility, ease of deployment, and completely dynamic changes. The container environments are also dynamic since you have microservices spinning up and down as you go. I have never seen another tool doing this with the same reliability. 

    How has it helped my organization?

    Dynatrace has improved our organization through operational support. We also have a large services organization which directly works with customers, and sometimes you run into situations where customers ask how they can improve their applications. Traditionally, these service teams would go for assessments. Eventually, they would even go onsite and through performance workshops with them to find some low-hanging fruit that they could address, and this was very tedious work. By introducing Dynatrace, you suddenly have real-time data. Then, the process of doing performance reviews switches from workshops or a defined time frame analysis (and then taking actions) to a more continuous approach where you constantly have Dynatrace performance data of the landscape. 

    Service engineers save a lot of time because they can just go in, look at the data, and share it with the customer, who has the same view, and say, "Here's an improvement which can be immediately implemented." It's not like a collection of big, multiple findings that are consolidated into one results presentation, then the customer needs to do something. It's more like a continuous performance analysis and improvement process, which is more efficient than those workshop approaches. That's one of the biggest advantages that our services team sees because it helps DevOps to focus on continuous delivery and shift quality issues to pre-production.

    Dynatrace is tightly integrated with ITSM. It's integrated with ServiceNow, which our support team is using.

    We provide a platform, then the customer ships the code and deploys it. Therefore, we rely on testing by the customer, and sometimes, they miss something and it breaks. Then, it doesn't work as expected so we have to step in, and say, "Yes, your site is down," or "It's not functioning properly." We do the analysis because typically the customer says, "Okay, it's not us. It must be you as the service provider." This is where we gain a lot of efficiency. The support team is the first line of defense there. They get the information to determine if they are able to quickly pinpoint the problem. E.g., the customer deployed, then two hours later, issues were occurring. This is when you don't want to waste time. Our support engineers need the visibility so they can immediately be able to communicate to the customer, saying, "Yes, it's on our side," or "It's on your side." If it's on the customer's side, they can let them know exactly where they need to go. This is where we gain most of the time.

    It helps our operations that the solution uses a single agent for automated deployment and discovery. If you think about all the work in the past where we had different agents, tools, or scripts deployed to monitor specific aspects of an environment and different tools, then having one agent definitely helps. For example, for our rollout, when we migrated all the different tools to Dynatrace, we did this over the weekend. We installed the agent, then just watched the data and findings coming in, which was a huge benefit. We installed one thing and it discovers everything.

    I suppose the solution has decreased time to market for our individual customers with new innovations/capabilities. Dynatrace helps them gain better insights, allowing them to do another deployment faster.

    What is most valuable?

    It has auto detection of almost everything. The full stack capabilities to get one agent deployed allows you not to worry about anything else because the agent detects everything. This is in combination with the AI so you don't need to worry about any baselines or setting up any thresholds. This is all done automatically, which brings us the biggest benefit.

    Configuration as code, integrated through APIs, is really important when automating at scale. If you think about the tens of thousands of hosts that you deploy to, then APIs are key for automating deployments, managing those instances and their configuration, and integrating with other systems. None of that is possible without sophisticated, far-reaching APIs.
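
    As a rough illustration of what configuration as code means in practice, the sketch below compares a desired configuration (as it might be kept in version control) with the live configuration and produces a list of changes to apply. The setting names and the `plan_changes` helper are hypothetical, and no real Dynatrace API call is shown; applying the plan would be done through whatever configuration API the tool exposes.

```python
def plan_changes(desired, current):
    """Compare desired settings with the live ones and list the operations needed."""
    plan = []
    for key, value in sorted(desired.items()):
        if key not in current:
            plan.append(("create", key, value))
        elif current[key] != value:
            plan.append(("update", key, value))
    # Anything live that is no longer declared gets removed.
    for key in sorted(current.keys() - desired.keys()):
        plan.append(("delete", key, None))
    return plan


# Hypothetical settings, invented for this example.
desired = {"alert.page_load_s": 3.0, "tag.environment": "production"}
current = {"alert.page_load_s": 5.0, "tag.legacy": "true"}

plan = plan_changes(desired, current)
```

    Computing the plan separately from applying it means the same declared configuration can be reviewed, then rolled out to many tenants or hosts without hand edits.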

    Dynatrace easily integrates with our infrastructure or applications, then reliably triggers self-healing actions or remediation actions. This is something that we really love to use because it definitely removes a lot of human interaction. You just let the machine do the job and can trust it, and that's the most important. I have seen systems where the users were very reluctant to trust the system to take actions where typically a human would do the job manually. Dynatrace considers all the information that it gathers, then triggers self-healing actions which are quite reliable. It doesn't need a lot of human adjustment to make it work.

    We use real-user monitoring a lot to get insight into end users and our customers, e.g., customer behavior. 

    What needs improvement?

    While the integrations are great, sometimes our customers are not as far along in Dynatrace concepts from a technical perspective as they need to be, whether it's a cultural thing or an educational thing. Thus, some of our customers are not as advanced as Dynatrace would like them to be. From a technical perspective, all the capabilities are there but the concepts are not yet spread out within the ecosystem to their fullest extent. Therefore, Dynatrace is ahead of its time.

    Documentation could be improved. E.g., you don't know how to properly use Dynatrace because documentation is lagging behind the features being deployed.

    On very large deployment scenarios, the APIs for configuration and configuration management came in slowly. This is something that is good already but could be better.

    In the product, I am missing some configuration automation APIs.

    For how long have I used the solution?

    The company has been using Dynatrace on different occasions for the past eight years. The current product of Dynatrace has only been out for four years.

    What do I think about the stability of the solution?

    We operate services for our customers with pretty high SLAs. We guarantee the systems we run are reliable. We also guarantee uptime. In the past three years, we have run up to 50 updates with Dynatrace and had only one or two issues where the system had to be brought down. There are almost no issues at all with stability. It is rock-solid.

    They are improving constantly with every release and adding new stuff. We have updates about every two weeks.

    What do I think about the scalability of the solution?

    We have about 2,500 people using it.

    We currently manage seven Dynatrace clusters with several thousand Dynatrace tenants, and in total almost 30,000 hosts are monitored with Dynatrace. We're not reaching the limits of Dynatrace's scalability. This is probably one of the largest deployments, but we have not seen any limitations so far.

    We want to leverage even more services:

    • Real-user monitoring
    • Possibly look into session replay.
    • Expand the footprint of synthetic monitoring.
    • Build more integrations by leveraging all the data Dynatrace captures for custom metrics into our BI reporting, billing systems, internal cross charging functionality, and scaling/optimizing our environments in terms of resource usage. 

    There is a lot of data in Dynatrace at the moment that we do not fully utilize.

    How are customer service and technical support?

    The technical support is great. We have a pretty good contract with Dynatrace for contacting support. They are pretty responsive and very knowledgeable. You get a DevOps engineer from Dynatrace jumping on immediately with very high expertise. You don't get the typical Level 1 automated standard reply: "Yes, we will take care of it," but then you have to ping back.

    Which solution did I use previously and why did I switch?

    We came from a former Dynatrace product called AppMon, which is not really sold anymore, though there are customers who still use it. We used it for the traditional APM scenario, then migrated to Dynatrace to extend the visibility for hybrid cloud deployment.

    We had been using a mixture of Opsview, Splunk, SolarWinds, and other tools. We switched because of the complexity of managing all these tools. It became unmaintainable. E.g., historically, people would write scripts for Nagios Opsview, then maintain them. If we lost the people who had been maintaining those scripts, then nobody knew how the checks worked for those custom scripts. Also, the maintenance overhead was pretty high.

    From the perspective of the end users using different monitoring solutions, you had different teams who had to go to different tools and contend with data in one tool not being exactly the same data as another tool. While the overlap between tools was there, the complexity in accessing those tools and knowing how to use those tools became such a big organizational and maintenance overhead that we decided to pull them all into one tool to harmonize it. We wanted one tool where the interface and data are the same regardless of whatever you're monitoring.

    How was the initial setup?

    The initial setup was straightforward. We looked into Dynatrace and were able to roll it out to 12,000 hosts within four weeks. 

    From the Managed version, you can have it installed and up and running in less than an hour. This is on the condition that you have the hardware to install it on and access to the systems/services that you want to monitor.

    Initially, some people were skeptical about the one agent really working, so we did test it. Now, we have had so many good experiences that when we deploy, build new services, or spin up new instances, Dynatrace is one of the first things that is always there. We don't even test the agents anymore. We completely rely on this mature product that is solid and stable when we deploy staging, development, QA environments, or playgrounds. There is no deployment without Dynatrace agents.

    What about the implementation team?

    We deployed Dynatrace ourselves as we have a lot of experience working with it. Deploying Dynatrace depends on the environments that you run it on. Since that was all orchestrated with things like Puppet, Chef and Ansible for us, it was just a matter of writing a bit of automation code that wasn't already in place. One person was needed to do this properly, and it is not that hard of work because it applies to almost every environment that we deploy. For new services that we provide, it's done within the development teams writing those services. Therefore, there is no dedicated Dynatrace team responsible for integrating Dynatrace with services.

    There is an API for almost everything. If you run it Managed, this means you have to administer Dynatrace's installation yourself. You run it and take care of some prerequisites, like sizing. Any system updates, bug fixes, or upgrades to the whole cluster have almost zero maintenance. All you need to do is confirm it or let Dynatrace update itself. In the past three years, we had almost 50 updates or installations where we didn't even need to touch anything. We just had one or two occasions where an update broke functionality, and those were fixed with the next update and within hours. It's almost self-maintaining.

    We do have a dedicated staff for maintenance, but this team is not spending a lot of time on actually managing Dynatrace. They do the integrations of Dynatrace and other tools as well as development of custom integrations and configurations. This team is also responsible for the infrastructure and ensuring the machines Dynatrace runs on are scaled or adjusted properly. However, this is minor effort for them. We have a dedicated team of 20 to 30 SRE engineers and their responsibility is not only to Dynatrace. They are responsible for the whole infrastructure and surrounding tools.

    What was our ROI?

    As we use it internally, our internal operations have gained a lot more efficiency. The time to triage and resolve problems in different environments has been reduced by 50 percent, if not more. When Dynatrace raises a problem, the team does not need to bring together experts from other teams to look at the problem, log files, etc. You almost have Dynatrace training our support engineers because it's so easy to pinpoint the root cause of problems.

    The solution has decreased our mean time to identification by approximately 50 percent.

    There has been a positive impact on the instances run for our customers. Overall, uptime got better because we became faster at fixing the problems causing downtime.

    The solution has saved us money through the consolidation of tools. With a hybrid landscape, we had multiple tools. When we consolidated, we removed four or five other monitoring tools with one. For the last ROI calculation that I did, Dynatrace was saving us up to $500,000 per year. 

    In addition, our speed is up 40 to 50 percent. Therefore, our human cost and licensing savings together are one to two million.

    What's my experience with pricing, setup cost, and licensing?

    We are a very big customer. We obviously have a special price point. 

    If there are no corporate requirements to run Dynatrace Managed (operating it yourself), I would definitely go for the SaaS option. For small and medium-sized companies, the SaaS option is probably the cheapest one. You don't need to look into operating it. You don't need to run hardware. It is pay as you go.

    We looked into what Dynatrace could actually replace. If the price point seems high, think about the impact that constantly replacing monitoring tools would have on the entire organization. If implemented correctly, it has a lot of savings potential for the organization. That is something that should go into any ROI calculation.

    Which other solutions did I evaluate?

    We looked at the other big players in this space: New Relic and AppDynamics. Looking at the cloud and full stack capabilities, ease of deployment, and scalability that Dynatrace has, it definitely stood out in comparison. The full stack story was pretty compelling, where you have one agent deployed and it provides everything.

    What other advice do I have?

    Trust what it's doing. Don't question what it's doing. If you don't understand it yet, take the time to try to understand it. Do not implement or force the old ways of monitoring onto a completely different approach, like Dynatrace. That's definitely the biggest lesson a lot of people in our organization had to learn.

    Be curious and embrace the different approach. It is definitely worth it. The approach is different, but it is a good one and it actually works. Those guys know what they have built and what they are doing.

    It is partly integrated with CI/CD. We operate a platform with our applications, but our customers are responsible for testing and for the CI/CD pipelines that deploy into our environments. Internally, some of our teams use it. The majority of CI/CD deployment is our customers' responsibility, and while we do provide them Dynatrace for CI/CD, we do not control how they integrate it.

    We are in the process of rolling out synthetic monitoring at scale to replace other tools. 

    We are not yet using session replay, which is mostly due to data compliance restrictions. We have very strict data privacy protections. We do have customers who are highly interested in using the feature, but we are not using it at the moment.

    Overall, I would give the solution a clear 10 (out of 10).

    Which deployment model are you using for this solution?

    Hybrid Cloud
    Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor. The reviewer's company has a business relationship with this vendor other than being a customer: Partner.