Why Database Software “Wastes” Hardware

Databases are software, but they depend on hardware. The database itself is code, developed and deployed to classify, aggregate, control, and manage various types of data, but the foundation of any database is hardware. This is simply because databases run on servers, and servers are, at bottom, hardware.

A database server might sit in a box in an organization’s data room or on-premises data center. Alternatively, it might sit on a “blade” in a larger data center facility that customers tap into for on-demand cloud computing services. Or the database might straddle the worlds of on-premises and public cloud data centers in a hybrid combination of the two.

The point is, whatever form, shape, and type of database we choose to use, the software that drives it fundamentally depends on a piece of hardware somewhere.

How hardware waste occurs

If we accept all of the above as home truths, we could also assume that database software should be smart enough to know what data to put where, when, and why, right? Yes, database software is smart, which is why there is so much of it. But with so much to do, so much data to serve, and so many different ways to build database query applications, not all databases put their shoes away (that is, store their data) and use the hardware they run on as effectively as others. Databases written in high-level languages (like Java) communicate with the machine through an intermediary, so they necessarily leave a certain amount of server performance on the table, and that performance is, in the context of this discussion, “wasted.”

Israeli-born NoSQL database company ScyllaDB believes it has a way to build databases that are tidier, faster, better managed, and self-optimizing. The company advocates a close-to-the-metal approach for its Scylla database, but what does that mean in simple terms?

In the database world, “close-to-the-metal” refers to database software that has intimate knowledge of the hardware it runs on (hardware RAM addresses and the wider instruction set, if you want to be technical). That intimacy means the database can draw more power from the server hardware beneath it. The trade-off is less flexibility and a degree of lock-in to the hardware in question, but this precision engineering delivers greater speed. So how fast?
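To make the idea concrete, here is a minimal sketch of one close-to-the-metal technique ScyllaDB is known for: a shard-per-core design, in which each worker is pinned to a single CPU core and owns its own private slice of the data, so no locks are shared across cores. Everything in this sketch (the queue protocol, the routing function) is a hypothetical illustration in Python; Scylla’s actual implementation is C++ on its Seastar framework, and os.sched_setaffinity works only on Linux.

```python
# Minimal shard-per-core sketch (illustrative only; Linux-only).
# Each worker process is pinned to one CPU core and owns a hash
# slice of the keyspace, so no locks are shared across cores.
import os
import multiprocessing as mp

NUM_SHARDS = os.cpu_count() or 2

def shard_worker(shard_id: int, inbox: mp.Queue) -> None:
    # Pin this process to a single core: the OS will not migrate it,
    # which keeps its data hot in that core's cache.
    os.sched_setaffinity(0, {shard_id})
    store = {}  # this shard's private slice of the data; no locking needed
    while True:
        op, key, value = inbox.get()
        if op == "stop":
            break
        if op == "put":
            store[key] = value

def shard_for(key: str) -> int:
    # Route each key to exactly one core-owned shard.
    return hash(key) % NUM_SHARDS

if __name__ == "__main__":
    queues = [mp.Queue() for _ in range(NUM_SHARDS)]
    workers = [mp.Process(target=shard_worker, args=(i, q))
               for i, q in enumerate(queues)]
    for w in workers:
        w.start()
    queues[shard_for("user:42")].put(("put", "user:42", {"plan": "basic"}))
    for q in queues:
        q.put(("stop", None, None))
    for w in workers:
        w.join()
```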

Millions and billions

Dor Laor, CEO of ScyllaDB, explains that the Scylla database can perform millions of operations per second (OPS) on a single node (a node can mean many things in IT, but in this case it means a server). The company cites independent tests that show a cluster of Scylla database servers reading one billion rows per second (RPS). The sum total here is more speed and more power from each server. This technological proposition will not be the most efficient route for every use case (a smaller dataset can use a traditional relational database), but for big data workloads with a large number of data points, Scylla makes its case.

“For 99.9% of applications, Scylla delivers all the power a customer will need, on workloads that other NoSQL databases cannot touch, and at a fraction of the cost of an in-memory solution.”

Dor Laor, CEO of ScyllaDB.

CEO Laor says Scylla is a good choice for high-throughput, low-latency (i.e. very little lag) software applications. Scylla is also good with high-density nodes, that is, servers that hold a huge amount of data.

Node sprawl occurs when you run database instances in multiple locations to segment and allocate different information workloads to different areas. Higher-density nodes mean there is a lot of data in one place, which is convenient, but also a bigger chunk to chew on at any one time.

The company is now looking to provide what it calls “high-density media,” meaning that old pieces of data fragmented across different temporary storage areas are moved to long-term storage nodes, where they can reside more comfortably, with more precision, and for longer. This means being able to buy less overall storage in the short term.
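The article does not detail the mechanism, but the general tiering idea can be sketched as follows: records that pass an age threshold migrate from a fast, expensive tier to a dense, cheaper long-term tier. The two stores and the 30-day threshold below are assumptions for illustration, not ScyllaDB’s actual policy.

```python
# Minimal hot/cold tiering sketch (illustrative; the two stores and the
# 30-day threshold are assumptions, not ScyllaDB's actual policy).
import time

COLD_AFTER_SECONDS = 30 * 24 * 3600  # assume data goes "cold" after 30 days

hot_store = {}   # fast, expensive tier: recent, frequently accessed data
cold_store = {}  # dense, cheaper tier: long-term resident data

def put(key, value):
    # New writes always land in the hot tier, stamped with a write time.
    hot_store[key] = (value, time.time())

def migrate_cold_data():
    # Move every record older than the threshold to long-term storage.
    now = time.time()
    cold_keys = [k for k, (_, ts) in hot_store.items()
                 if now - ts > COLD_AFTER_SECONDS]
    for key in cold_keys:
        cold_store[key] = hot_store.pop(key)
```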

Comcast tunes in Scylla

Phil Zimich, senior director of software development and engineering at Comcast, explains the company’s shift from the Cassandra database to Scylla. Comcast uses its X1 platform to drive firmware on devices, upgrading them for the future TV and voice services it wants to provide. Comcast manages 31 million devices across 15 million homes, all handled at an individual account level. There are 21 different web services in the Comcast X1 Scheduler that deliver recordings to users when they want them. Users must also be able to cancel or edit recordings and receive reminders; all of this takes 25 million account calculations per day, so this is the kind of use case Scylla was created for. Following its migration to Scylla, Comcast has gone from 962 nodes to 78 today.

For the Scylla database to do what it does and “waste” less hardware, it makes the most of all the processing power (CPU) and RAM available in a given computing environment. An additional technique here is something called incremental compaction. Compaction is the process by which updated and deleted data is consolidated so that an organization can store data in the most efficient way. Different compaction processes run on different compaction strategies, but an inefficient compaction process can see the database reserving up to 50% of its space for the process itself, rather than devoting that space to actual storage.
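A rough, back-of-the-envelope illustration of where that 50% figure comes from (the numbers below are assumptions, not from ScyllaDB): a classic compaction merge must write its output before it can delete its inputs, so merging all live data at once needs free space roughly equal to the data itself, while an incremental approach that works a fixed-size chunk at a time needs only a few chunks of headroom.

```python
# Back-of-the-envelope compaction headroom (assumed numbers, for illustration).
disk_gb = 1000
data_gb = 500

# Classic size-tiered compaction: in the worst case, all live data is merged
# at once, and the merged output must coexist with its inputs until the
# merge finishes.
size_tiered_headroom_gb = data_gb              # ~50% of the disk reserved
print(size_tiered_headroom_gb / disk_gb)       # 0.5

# Incremental compaction: data is merged a fixed-size chunk at a time,
# so only a handful of in-flight chunks need spare space.
chunk_gb = 1
in_flight_chunks = 4
incremental_headroom_gb = chunk_gb * in_flight_chunks
print(incremental_headroom_gb / disk_gb)       # 0.004
```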

“It’s also important to remember that the more data you write to a database, the more you create a ‘data debt’ in terms of data that has to queue for analysis and/or final storage. If this scenario results in too much data in the queue, the database itself suffers a performance loss. We know that different workloads will exist for the database, so the database must be able to prioritize the processing of more critical real-time work over less critical work,” said Laor, CEO of Scylla.
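As a hedged illustration of that prioritization idea (not Scylla’s actual scheduler, which lives inside its C++ engine), here is a minimal two-class priority queue in which real-time requests always drain before queued background work:

```python
# Minimal workload-prioritization sketch (illustrative only).
import heapq
import itertools

REALTIME, BACKGROUND = 0, 1  # lower number = served first
counter = itertools.count()  # tie-breaker keeps FIFO order within a class
queue = []

def submit(priority, task_name):
    heapq.heappush(queue, (priority, next(counter), task_name))

def run_all():
    while queue:
        priority, _, task_name = heapq.heappop(queue)
        kind = "realtime" if priority == REALTIME else "background"
        print(f"running {kind}: {task_name}")

submit(BACKGROUND, "compact old SSTables")
submit(REALTIME, "serve user read")
submit(BACKGROUND, "flush analytics batch")
submit(REALTIME, "serve user write")
run_all()  # realtime tasks drain first, then background work proceeds
```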

So, have we really “wasted” hardware with our software in the past? In some cases, yes, and those cases will now surface more often as the need to scale (sometimes massively) comes to the fore. Think of a smart heating, air conditioning, or home security system with its sensors and cameras. These types of apps usually offer a day of video playback snapshots for free, but users have to pay for 30 days or more. When people start signing up for these services in droves, the real scaling challenge arises.

“Servers will fail. If your software system is spread across 365 server nodes, then let’s say there is the potential for one node failure per day. If your software system is spread across 10 nodes in a massively denser environment, the mean time between failures across the farm is logically longer because the server farm is smaller. Failures always happen, but with modern software tools to help with backup and restore, the whole process can be managed in a more tightly controlled environment,” said Laor.
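A quick worked version of that arithmetic, assuming (for illustration) that each node independently fails about once a year:

```python
# Expected cluster-wide failure rate, assuming each node fails ~once/year.
per_node_failures_per_year = 1.0

for nodes in (365, 10):
    cluster_failures_per_year = nodes * per_node_failures_per_year
    days_between_failures = 365 / cluster_failures_per_year
    print(f"{nodes} nodes: a failure roughly every {days_between_failures:.1f} day(s)")

# 365 nodes: a failure roughly every 1.0 day(s)
# 10 nodes: a failure roughly every 36.5 day(s)
```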

In the past, we didn’t always think about optimizing systems, because the next processor would come along and provide such a large incremental increase in processing power that the optimization wasn’t worth the human developer’s time. Now, with Moore’s Law on its deathbed, it is more important to think about building more efficient and less expensive software architectures.

A wake-up call for efficiency?

Is this wake-up call arriving at the right time? After all, Gen Z is ‘woke’ enough to make sure we tackle climate change and global waste in ways we might not have envisioned a decade ago. Maybe it’s only fitting that we also think about the hardware consumption habits of software and make them more environmentally friendly.

While this doesn’t reduce the carbon footprint in quite the same way as cutting air travel or recycling plastic bags, more efficient use of any resource is, surely, still a good thing.

Maria H. Underwood