ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)

Latest Articles

Selecting the Top-Quality Item Through Crowd Scoring

EQ: A QoE-Centric Rate Control Mechanism for VoIP Calls


About TOMPECS

ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS) is a new ACM journal that publishes refereed articles on all aspects of the modeling, analysis, and performance evaluation of computing and communication systems.

The target areas for the application of these performance evaluation methodologies are broad, and include traditional areas such as computer networks, computer systems, storage systems, telecommunication networks, and Web-based systems, as well as new areas such as data centers, green computing/communications, energy grid networks, and on-line social networks.

Issues of the journal will be published on a quarterly basis, appearing both in print form and in the ACM Digital Library. The first issue will likely be released in late 2015 or early 2016.

Forthcoming Articles
RAPL in Action: Experiences in Using RAPL for Power Measurements

To improve energy efficiency and conform to power budgets, it is important to be able to measure the power consumption of cloud computing servers. Intel's Running Average Power Limit (RAPL) interface is a powerful tool for this purpose. RAPL provides power-limiting features and accurate energy readings for CPUs and DRAM, which are easily accessible through different interfaces on large distributed computing systems. Since its introduction, RAPL has been used extensively in power measurement and modeling. However, the advantages and disadvantages of RAPL have not yet been well investigated. To fill this gap, we conduct a series of experiments to disclose the underlying strengths and weaknesses of the RAPL interface, using both customized microbenchmarks and three well-known application-level benchmarks: Stream, Stress-ng, and ParFullCMS. Moreover, to make the analysis as realistic as possible, we leverage a production-level power measurement dataset from Taito, a supercomputing cluster of the Finnish Center of Scientific Computing (CSC). Our results illustrate different aspects of RAPL, and we document the findings through a comprehensive analysis. Our observations reveal that RAPL readings are highly correlated with plug power, are sufficiently accurate, and incur negligible performance overhead. Experimental results suggest that RAPL can be a very useful tool for measuring and monitoring the energy consumption of servers without deploying complex power meters. We also show that there are still open issues, such as driver support, non-atomicity of register updates, and unpredictable timings, that might weaken the usability of RAPL in certain scenarios. For such scenarios, we pinpoint solutions and workarounds.
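As a minimal sketch of how RAPL energy counters can be sampled on Linux, the snippet below reads a package-domain counter through the powercap sysfs interface. The sysfs path, the sampling interval, and the need for elevated privileges are assumptions that vary by kernel and platform; this illustrates the general mechanism, not the instrumentation used in the paper.

```python
# Minimal sketch: sampling a RAPL package-domain energy counter through
# Linux's powercap sysfs interface (assumes a kernel with intel_rapl
# support; the path below is an assumption and may need root to read).
import time

RAPL_DIR = "/sys/class/powercap/intel-rapl:0"  # package 0 domain

def read_energy_uj():
    # Cumulative energy in microjoules since the counter last wrapped.
    with open(f"{RAPL_DIR}/energy_uj") as f:
        return int(f.read())

def read_max_energy_uj():
    # Wrap-around point of the counter, used to correct overflow.
    with open(f"{RAPL_DIR}/max_energy_range_uj") as f:
        return int(f.read())

def average_power_watts(interval_s=1.0):
    start = read_energy_uj()
    time.sleep(interval_s)
    end = read_energy_uj()
    if end < start:                      # counter wrapped during the interval
        end += read_max_energy_uj()
    return (end - start) / interval_s / 1e6  # uJ/s -> W

if __name__ == "__main__":
    print(f"Package power: {average_power_watts():.2f} W")
```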

Disk Prefetching Mechanisms for Increasing HTTP Streaming Video Server Throughput

Most video streaming traffic is delivered over HTTP using standard web servers. While traditional web server workloads consist of requests primarily for small files that can be serviced from the file system cache, HTTP video streaming workloads often service a long tail of large, infrequently requested videos. As a result, optimizing disk accesses is critical to obtaining good server throughput. In this paper we explore serialized, aggressive disk prefetching, a technique which can be used to improve the throughput of HTTP streaming video web servers. We identify how serialization and aggressive prefetching affect performance and, based on our findings, we construct and evaluate Libception, an application-level shim library that implements both techniques. By dynamically linking against Libception at runtime, applications can transparently benefit from serialization and aggressive prefetching without changing their source code. In contrast to other approaches that modify applications, make kernel changes, or attempt to optimize kernel tuning, Libception provides a portable and relatively simple system in which techniques for optimizing I/O in HTTP video streaming servers can be implemented and evaluated. We empirically evaluate the efficacy of serialization and aggressive prefetching both with and without Libception, using three web servers (Apache, nginx, and the userver) running on two operating systems (FreeBSD and Linux). We find that, by using Libception, we can improve streaming throughput for all three web servers by at least a factor of 2 on FreeBSD and a factor of 2.5 on Linux. Additionally, we find that with significant tuning of Linux kernel parameters, we can achieve performance similar to Libception by globally modifying Linux's disk prefetch behaviour. Finally, we demonstrate Libception's potential utility for improving the performance of other workloads by using it to reduce the completion time of a microbenchmark involving two applications competing for disk resources.
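Libception itself is a C shim library interposed via dynamic linking; purely as an illustration of the idea it implements, the sketch below combines a global lock (serialization) with posix_fadvise(POSIX_FADV_WILLNEED) hints (aggressive prefetching). The window and chunk sizes are hypothetical, and this is a sketch of the technique, not Libception's implementation.

```python
# Illustrative sketch only: serialized, aggressive prefetching while
# streaming a large file. A global lock serializes disk requests across
# streams; posix_fadvise(POSIX_FADV_WILLNEED) asks the kernel to prefetch
# a large window ahead of the streaming position. Requires a POSIX system.
import os
import threading

PREFETCH_WINDOW = 8 * 1024 * 1024   # hypothetical 8 MiB prefetch depth
CHUNK = 256 * 1024                  # amount handed to the client per read

disk_lock = threading.Lock()        # one disk request in flight at a time

def stream_file(path, send):
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            with disk_lock:                          # serialization
                if offset % PREFETCH_WINDOW == 0:
                    # Aggressive prefetch: hint the whole next window.
                    length = min(PREFETCH_WINDOW, size - offset)
                    os.posix_fadvise(fd, offset, length,
                                     os.POSIX_FADV_WILLNEED)
                data = os.pread(fd, CHUNK, offset)
            send(data)              # network send happens outside the lock
            offset += len(data)
    finally:
        os.close(fd)
```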

An Experimental Performance Evaluation of Autoscalers for Complex Workflows

Elasticity is one of the main features of cloud computing, allowing customers to scale their resources based on the workload. Many autoscalers have been proposed in the past decade to decide, on behalf of cloud customers, when and how to provision resources to a cloud application based on the workload, utilizing cloud elasticity features. However, in prior work, when a new policy is proposed it is seldom compared to the state of the art, and is often compared only to static provisioning using a predefined QoS target. This reduces the ability of cloud customers and cloud operators to choose and deploy an autoscaling policy, as there is seldom enough analysis of the performance of autoscalers under different operating conditions and with different applications. In our work, we conduct an experimental performance evaluation of autoscaling policies, using workflows as the application model; workflows are a commonly used formalism for automating resource management for applications with well-defined yet complex structures. We present a detailed comparative study of general state-of-the-art autoscaling policies, along with two new workflow-specific policies. To understand the performance differences between the seven policies, we conduct various forms of pairwise and group comparisons. We report both individual and aggregated metrics. As many workflows have deadline requirements on their tasks, we study the effect of autoscaling on workflow deadlines. Additionally, we look into the effect of autoscaling on accounted and hourly-charged costs, and evaluate the performance variability caused by autoscaler selection for each group of workflow sizes. Our results highlight the trade-offs between the studied policies, how they impact meeting deadlines, and how they perform under different operating conditions, thus enabling a better understanding of the current state of the art.
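The paper's seven policies are not reproduced here; as a hedged illustration of the general threshold-style autoscaling policies such studies compare against, the sketch below scales out when utilization crosses an upper bound and scales in below a lower bound. All thresholds and capacity limits are made-up example values.

```python
# Hypothetical sketch of a simple threshold autoscaling policy (not one
# of the paper's seven policies). Thresholds and limits are examples.
def threshold_policy(busy_vms, total_vms, min_vms=1, max_vms=50,
                     upper=0.8, lower=0.3):
    """Return the VM count to provision for the next interval."""
    utilization = busy_vms / total_vms if total_vms else 1.0
    if utilization > upper and total_vms < max_vms:
        return min(max_vms, total_vms * 2)       # aggressive scale-out
    if utilization < lower and total_vms > min_vms:
        return max(min_vms, total_vms - 1)       # conservative scale-in
    return total_vms

# Example: 9 of 10 VMs busy -> 90% utilization -> scale out to 20 VMs.
print(threshold_policy(busy_vms=9, total_vms=10))
```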

An Empirical Analysis of Amazon EC2 Spot Instance Features Affecting Cost-effective Resource Procurement

Many cost-conscious public cloud workloads (tenants) are turning to Amazon EC2's spot instances because, on average, these instances offer significantly lower prices (up to 10 times lower) than on-demand and reserved instances of comparable advertised resource capacities. To use spot instances effectively, a tenant must carefully weigh the lower costs of these instances against their poorer availability. To this end, we empirically study four features of EC2 spot instance operation that a cost-conscious tenant may find useful to model. Using extensive evaluation based on both historical and current spot instance data, we show shortcomings in the state-of-the-art modeling of these features, which we overcome. Our analysis reveals many novel properties of spot instance operation, some of which offer predictive value while others do not. Using these insights, we design predictors for our features that strike a balance between computational efficiency (allowing for online resource procurement) and cost-efficacy. We explore case studies in which we implement prototypes of dynamic spot instance procurement advised by our predictors for two types of workloads. Compared to the state of the art, our approach achieves (i) comparable cost but much better performance (fewer bid failures) for a latency-sensitive in-memory Memcached cache, and (ii) an additional 18% cost savings with comparable (if not better) performance for a delay-tolerant batch workload.
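The paper's predictors are not reproduced here; as a simple illustration of the cost/availability trade-off in spot bidding, the sketch below picks a bid at a chosen quantile of recent price history. The availability target and the prices are hypothetical example values.

```python
# Hypothetical illustration (not the paper's predictors): choose a spot
# bid as a quantile of recent price history, trading expected cost
# against the probability of bid failure.
def suggest_bid(price_history, availability_target=0.95):
    """Bid at the availability_target quantile of observed prices."""
    prices = sorted(price_history)
    idx = min(len(prices) - 1, int(availability_target * len(prices)))
    return prices[idx]

# Example: made-up recent hourly spot prices in $/hour for one instance type.
history = [0.031, 0.029, 0.035, 0.030, 0.052, 0.028, 0.033]
print(f"Suggested bid: ${suggest_bid(history):.3f}/hour")
```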

Special issue: Selected papers from the 8th ACM/SPEC International Conference on Performance Engineering (ICPE 2017)

