Tuesday, July 28, 2015

Tuning Tomcat For A High Throughput, Fail Fast System

Problem: Netflix has a number of high-throughput, low-latency mid-tier services. In one of these services, it was observed that when traffic surged within a very short span of time, the machines became CPU starved and unresponsive. This would lead...

Friday, July 24, 2015

Java in Flames

Java mixed-mode flame graphs provide a complete visualization of CPU usage and have just been made possible by a new JDK option: -XX:+PreserveFramePointer. We've been developing these at Netflix for everyday Java performance analysis as they can identify all CPU consumers and...

Tuesday, July 14, 2015

Tracking down the Villains: Outlier Detection at Netflix

It’s 2 a.m. and half of our reliability team is online searching for the root cause of why Netflix streaming isn’t working. None of our systems are obviously broken, but something is amiss and we’re not seeing it. After an hour of searching we realize there is one rogue server...