Every IT manager’s secret dream: every SLA issue turns out to be “we just need more hardware.” No root-cause analysis, no war rooms, no heroic debugging sessions at 3 in the morning - just add servers and watch the graphs calm down. Capillaries was built with exactly that fantasy in mind.
Picture this: you’ve got a critical data processing job running every hour/day/month, and it absolutely must finish within a strict time window. But the data keeps growing a little with every run, and suddenly your comfortable safety margin starts looking more like a countdown timer.
Now imagine being able to fix the problem the easy way: spin up a couple more servers, lean back in your chair, and watch the processing times drop back into the green. That’s the kind of operational happiness Capillaries aims for. Just add more daemon instances and Cassandra nodes.
Capillaries is not a data exploration tool. It’s a workhorse. It shows up every day, does the same repetitive heavy lifting, and doesn’t complain about it. As long as the incoming data keeps following familiar patterns, Capillaries keeps delivering results with boringly predictable performance - which, in infrastructure terms, is about as close to happiness as it gets.
The best part: Capillaries daemon instances don’t need to coordinate their efforts. Daemon workers simply pull small pieces of work (batches) from the message queue, read the source data, process it, save the results to Cassandra, and mark the batch as complete.
Once an entire Capillaries script node (a data processing step) is finished, the worker marks the node as complete. When all nodes in the script are complete, the whole run - the entire job - is marked as done. In other words, the orchestration is handled by the message queue and Cassandra. That duo quietly keeps everything moving forward and guarantees eventual job consistency.
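To make the batch / node / run bookkeeping described above concrete, here is a minimal, single-process sketch. It is not Capillaries code: `run_job`, `process`, the node names, and the batch counts are all hypothetical. A `queue.Queue` stands in for the message queue, a plain dict guarded by a lock stands in for Cassandra, and dependencies between nodes are ignored for brevity.

```python
import queue
import threading


def process(node, batch):
    """Hypothetical placeholder for 'read the source data and process it'."""
    return f"{node}/{batch} processed"


def run_job(nodes, worker_count):
    """Simulate one run. `nodes` maps a node name to its number of batches."""
    # Stand-in for the message queue: every batch of every node is a message.
    work_queue = queue.Queue()
    for node, batch_count in nodes.items():
        for batch in range(batch_count):
            work_queue.put((node, batch))

    # Stand-in for Cassandra: results plus batch/node/run completion state.
    store = {
        "results": {},
        "batches_done": {node: 0 for node in nodes},
        "nodes_done": set(),
        "run_done": False,
    }
    # The lock models the store's own consistency; workers never talk to each other.
    store_lock = threading.Lock()

    def worker():
        while True:
            try:
                node, batch = work_queue.get_nowait()  # pull a small piece of work
            except queue.Empty:
                return  # queue drained, this worker is finished
            output = process(node, batch)
            with store_lock:
                store["results"][(node, batch)] = output  # save the results
                store["batches_done"][node] += 1          # mark the batch complete
                if store["batches_done"][node] == nodes[node]:
                    store["nodes_done"].add(node)         # last batch: node complete
                    if len(store["nodes_done"]) == len(nodes):
                        store["run_done"] = True          # last node: run complete

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return store


if __name__ == "__main__":
    final_state = run_job({"load": 4, "aggregate": 3}, worker_count=3)
    print(final_state["run_done"], sorted(final_state["nodes_done"]))
```

Changing `worker_count` does not change the final state, only how the batches are spread across workers, which is the property that lets more daemon instances be added without extra coordination logic.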
Does that mean Capillaries scales linearly? Unlikely. True linear scalability is incredibly rare in distributed systems. It’s basically the unicorn of infrastructure engineering: everybody talks about it, but actually finding it is close to impossible. Capillaries performance measurements paint a more realistic picture - something much closer to logarithmic scalability.
Still, those measurements are extremely useful when you want to predict the performance of a cluster with, say, twice as many daemon instances and Cassandra nodes.
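As a rough illustration of that kind of prediction, the sketch below fits a simple power-law model, `duration = c * instances ** (-p)`, to a few measured points and extrapolates to a larger cluster. The model is an assumption chosen for simplicity, not the one behind the Capillaries measurements, and the numbers are synthetic placeholders generated from the formula itself, not real results. In this model `p = 1` would be true linear scalability and `p < 1` means diminishing returns.

```python
import math


def fit_power_law(instance_counts, durations):
    """Least-squares fit of duration = c * instances**(-p) in log-log space."""
    xs = [math.log(n) for n in instance_counts]
    ys = [math.log(t) for t in durations]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    p = -slope                              # scaling exponent
    c = math.exp(mean_y - slope * mean_x)   # duration with a single instance
    return c, p


def predict_duration(c, p, instances):
    """Extrapolate the fitted model to a cluster of the given size."""
    return c * instances ** (-p)


if __name__ == "__main__":
    # Synthetic placeholder data (NOT Capillaries measurements):
    # run duration in seconds for 1, 4, and 16 daemon instances.
    instance_counts = [1, 4, 16]
    durations = [100.0, 50.0, 25.0]

    c, p = fit_power_law(instance_counts, durations)
    print(f"fitted exponent p = {p:.2f}")
    print(f"predicted duration with 32 instances: {predict_duration(c, p, 32):.1f} s")
```

With real measurements the fit will not be exact, so it is worth checking the prediction against at least one cluster size that was left out of the fit before trusting the extrapolation.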
If you are into Cassandra reads/writes and memory/CPU stats, check out the scalability numbers for specific AWS EC2 deployments, complete with Prometheus graphs.