Gremlin

@GremlinInc

Followers
5K
Following
3K
Media
2K
Statuses
5K

The Reliability Management Platform for high-velocity engineering teams

San Jose, CA
Joined January 2016
@GremlinInc
Gremlin
1 month
Thanks for listing us as a startup on the rise in 2025, @IBTimes! We've been heads down building and we can't wait to share more about what's coming this year and beyond 🚀
ibtimes.com
AI is moving at speeds no one could have anticipated, leaving the platform and engineering teams on the reactive. Below, we highlight 5 startups that are growing fast, improving reliability, provid...
0
1
3
@GremlinInc
Gremlin
45 minutes
Are your LLM-powered tools resilient to failures in upstream services, data, or infrastructure?
0
0
0
@GremlinInc
Gremlin
1 day
"We can spend one hour a week up front, or once a month [dealing with reliability], or we could lose two days to dealing with it." -Kolton Andrus, Gremlin CEO
0
1
2
@GremlinInc
Gremlin
2 days
When it comes to your Reliability Posture Management, do you have a plan or do you have a wish?
0
1
1
@GremlinInc
Gremlin
5 days
Too often, we equate reliability with uptime %. But when something breaks (and it will), what matters is how your system, & your team, responds. Resilience is about graceful degradation, fast detection, & clear mitigation paths. That’s where real reliability work happens.
0
0
0
@GremlinInc
Gremlin
6 days
Investing in proactive reliability efforts saves time and money, allowing your teams to focus on what truly matters. Learn more about the ROI of reliability right here:
0
0
0
@GremlinInc
Gremlin
7 days
"The CTO is responsible for the quality of the code that you're writing, the quality of the customer experience, the quality of the product." -Kolton Andrus, Gremlin Founder & CEO
0
1
1
@GremlinInc
Gremlin
8 days
AI workloads spike unpredictably: half of orgs saw 1,200% growth in AI traffic since July 2024. To meet demand reliably:
✅ Autoscale on queue size, not GPU usage
✅ Tune thresholds for latency
✅ Inject load to validate scaling rules
Learn more:
gremlin.com
Demand for AI services is ever-increasing. Are your systems prepared? This blog teaches you how to prepare for sudden demand surges.
0
0
0
@GremlinInc
Gremlin
9 days
If your team had an extra 3 hours of downtime next quarter, where would it hurt the most?
0
0
1
@GremlinInc
Gremlin
12 days
Australia’s @NAB, one of the “big four” banks, has over 25 teams using Gremlin for Chaos Engineering, achieving a 75% reduction in MTTR. With >45% of apps now in the cloud, resilience is baked into their stack. Learn more 👇
gremlin.com
NAB, Australia’s largest business bank and one of the ‘big four,’ kicked off its Technology Transformation program at the end of 2018 in pursuit of simplicity, agility, resilience and to stay...
0
0
0
@GremlinInc
Gremlin
13 days
Do you know the current reliability risk of your systems? Do you know right now how your services will react to common failures like a dependency going down? 📉 Read more in our latest post:
0
1
1
@GremlinInc
Gremlin
14 days
Catch our CEO & Founder @KoltonAndrus on The Code Story podcast, covering:
🚀 The evolution of #ChaosEngineering
🚀 What orgs need to maximize the value of their proactive reliability efforts
🚀 The role & limitations of #AI in complex online systems
0
1
1
@GremlinInc
Gremlin
15 days
1️⃣ Serverless ≠ reliable by default.
2️⃣ You do control reliability through retries, timeouts & error‑handling.
3️⃣ These best practices level up any stack, not just serverless.
Read more:
gremlin.com
Serverless means not managing servers, but you still need to consider reliability. In our latest blog, learn why and how to build resilient serverless applications.
0
0
1
@GremlinInc
Gremlin
16 days
If your engineers are up at 3 a.m. fixing issues, that’s not a badge of honor; it’s a sign something's broken. In this episode of the Smooth Scaling podcast, Gremlin CEO Kolton Andrus shares why we need to stop celebrating firefighting and start rewarding prevention instead. 🛠️
0
1
1
@GremlinInc
Gremlin
19 days
It's not DNS. There's no way it's DNS. It was DNS. Gremlin helps orgs get ahead of global DNS outages by testing their network stack in advance, ensuring they can always fall back to a secondary DNS provider. So you can confidently say "It's not DNS."
blog.cloudflare.com
On July 14th, 2025, Cloudflare made a change to our service topologies that caused an outage for 1.1.1.1 on the edge, resulting in downtime for 62 minutes for customers using the 1.1.1.1 public DNS...
0
1
1
@GremlinInc
Gremlin
20 days
“We couldn’t have done this without Gremlin and the close working relationship we have with them.” Read more about how Gremlin partnered with @Visa Cross-Border Solutions to build a culture of reliability across their organization:
gremlin.com
Using Gremlin, Visa Cross-Border Solutions was able to standardize resilience testing in staging to create a culture of reliability that improved the resilience and availability of services across...
0
0
0
@GremlinInc
Gremlin
21 days
"Our digital infrastructure is going to be almost as important as our physical infrastructure. And when it fails, it's going to be a big deal. People are going to expect things to work, to work fast, and to work when they need them to." -Kolton Andrus, Gremlin Founder & CEO
0
1
0
@GremlinInc
Gremlin
22 days
Cloud outages happen, even with AWS, Azure, or GCP. To stay online, you need more than SLAs:
✅ Know your failover plan
✅ Test DNS + multi-region recovery
✅ Run real outage drills
Prep before disaster strikes:
gremlin.com
Check out these testing best practices teams should follow to minimize the impact of cloud provider outages so they don’t catch you by surprise.
0
1
1
@GremlinInc
Gremlin
23 days
Downtime is expensive, and preventable. See how much downtime costs your org, and what to do about it, at the link below.
0
0
0
@GremlinInc
Gremlin
26 days
See how SEPHORA migrated from a monolithic app to Kubernetes using Gremlin and sailed through Black Friday & Cyber Monday with zero major outages. 💄🛒
gremlin.com
Gremlin helps the world’s leading prestige beauty retail brand smoothly migrate from monolithic to Kubernetes—and to pull off Black Friday and Cyber Monday without any major issues.
0
0
0
@GremlinInc
Gremlin
27 days
With downtime getting more & more expensive, it’s crucial to put time and effort into proactive reliability efforts, especially with the rise of AI. Learn how you can prepare for outages before they cost millions right here:
0
0
0