ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)

Latest Articles

On the Endurance of the d-Choices Garbage Collection Algorithm for Flash-Based SSDs

Garbage collection (GC) algorithms for flash-based solid-state drives (SSDs) have a profound impact on their performance, and many studies have focused... (more)

A New Framework for Evaluating Straggler Detection Mechanisms in MapReduce

Big Data systems (e.g., Google MapReduce, Apache Hadoop, Apache Spark) rely increasingly on speculative execution to mask slow tasks, also known as... (more)

Packet Clustering Introduced by Routers: Modeling, Analysis, and Experiments

In this article, we investigate a router’s inherent variation on packet processing time and its effect on interpacket delay and packet clustering. We propose a simple pipeline model incorporating the inherent variation, and two metrics—one to measure packet clustering and one to quantify inherent variation. To isolate the effect of... (more)

Investigating Characteristics of Internet Paths

Interactive and multimedia applications depend on the stability of end-to-end paths for predictable performance and good quality of service. On the other hand, network providers depend on multiple paths to ensure fault tolerance and use load balancing between these paths to enhance the overall network throughput. In this study, we analyze path... (more)

Scheduling for Optimal File-Transfer Delay Using Chunked Random Linear Network Coding Broadcast

We study the broadcast transmission of a single file to an arbitrary number of receivers using... (more)

Generalization of LRU Cache Replacement Policy with Applications to Video Streaming

Caching plays a crucial role in networking systems to reduce the load on the network and is commonly employed by content delivery networks (CDNs) to... (more)


Call for Nominations

The second term of the current co-Editors-in-Chief (EiCs) of TOMPECS is coming to an end, and the ACM Publications Board has set up a nominating committee to assist the Board in selecting the next EiC.

Nominations, including self-nominations, are invited for a three-year term as TOMPECS EiC, beginning on January 1, 2020. 



ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS) is an ACM journal that publishes refereed articles on all aspects of the modeling, analysis, and performance evaluation of computing and communication systems.


Forthcoming Articles
A Lyapunov Approach for Time Bounded Reachability of CTMCs and CTMDPs

Time bounded reachability is a fundamental problem in model checking continuous-time Markov chains (CTMCs) and Markov decision processes (CTMDPs) for specifications in continuous stochastic logics. It can be computed by numerically solving a characteristic linear dynamical system but the procedure is computationally expensive. We take a control-theoretic approach and propose a reduction technique that finds another dynamical system of lower dimension (number of variables), such that numerically solving the reduced dynamical system provides an approximation to the solution of the original system with guaranteed error bounds. Our technique generalises lumpability (or probabilistic bisimulation) to a quantitative setting. Our main result is a Lyapunov function characterisation of the difference in the trajectories of the two dynamics that depends on the initial mismatch and exponentially decreases over time. In particular, the Lyapunov function enables us to compute an error bound between the two dynamics as well as a convergence rate. Finally, we show that the search for the reduced dynamics can be computed in polynomial time using a Schur decomposition of the transition matrix. This enables us to efficiently solve the reduced dynamical system by computing the exponential of an upper-triangular matrix characterizing the reduced dynamics. For CTMDPs, we generalise our approach using piecewise quadratic Lyapunov functions for switched affine dynamical systems. We synthesise a policy for the CTMDP via its reduced-order switched system that guarantees the time bounded reachability probability lies above a threshold. We provide error bounds that depend on the minimum dwell time of the policy. We show the efficiency of the technique on examples from queueing networks, for which lumpability does not produce any state space reduction and which cannot be solved without reduction.
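The Lyapunov-function characterisation described above can be sketched in generic form; the symbols below are illustrative placeholders, not the paper's own notation. For original dynamics ẋ = Ax, reduced dynamics ż = Âz of lower dimension, and a lifting matrix P, the error e(t) = x(t) − Pz(t) admits a quadratic Lyapunov function whose exponential decay yields a bound of the shape:

```latex
V(e) = e^{\top} M e, \qquad
\frac{d}{dt}\, V\bigl(e(t)\bigr) \le -2\lambda\, V\bigl(e(t)\bigr)
\;\Longrightarrow\;
\bigl\|e(t)\bigr\| \;\le\; \sqrt{\kappa(M)}\; e^{-\lambda t}\, \bigl\|e(0)\bigr\|
```

That is, the approximation error is bounded by the initial mismatch ‖e(0)‖ times an exponentially decaying factor, where κ(M) is the condition number of M and λ plays the role of the convergence rate mentioned in the abstract.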

On the Role of Clustering in Personalized PageRank Estimation

Personalized PageRank (PPR) is a measure of the importance of a node from the perspective of another (we call these nodes the target and the source, respectively). PPR has been used in many applications, such as offering a Twitter user (the source) recommendations of who to follow (targets deemed important by PPR); additionally, PPR has been used in graph-theoretic problems such as community detection. However, computing PPR is infeasible for large networks like Twitter, so efficient estimation algorithms are necessary. In this work, we analyze the relationship between PPR estimation complexity and clustering. First, we devise algorithms to estimate PPR for many source/target pairs. In particular, we propose an enhanced version of the existing single pair estimator Bidirectional-PPR that is more useful as a primitive for many pair estimation. We then show that the common underlying graph can be leveraged to efficiently and jointly estimate PPR for many pairs, rather than treating each pair separately using the primitive algorithm. Next, we show the complexity of our joint estimation scheme relates closely to the degree of clustering among the sources and targets at hand, indicating that estimating PPR for many pairs is easier when clustering occurs. Finally, we consider estimating PPR when several machines are available for parallel computation, devising a method that leverages our clustering findings, specifically the quantities computed in situ, to assign tasks to machines in a manner that reduces computation time. This demonstrates that the relationship between complexity and clustering has important consequences in a practical distributed setting.
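For a concrete picture of the single-pair setting, here is a minimal Monte-Carlo PPR estimator using plain restart walks. This is a familiar baseline, not the Bidirectional-PPR primitive the paper enhances; the graph, restart probability, and walk count are illustrative.

```python
import random
from collections import Counter

def ppr_monte_carlo(graph, source, alpha=0.15, num_walks=10000, seed=0):
    """Estimate Personalized PageRank from `source` by simulating random
    walks that terminate (restart) with probability `alpha` at each step.
    `graph` maps each node to a list of out-neighbors.
    Returns a dict: node -> estimated PPR mass."""
    rng = random.Random(seed)
    endpoints = Counter()
    for _ in range(num_walks):
        node = source
        while rng.random() > alpha:      # continue the walk w.p. 1 - alpha
            nbrs = graph.get(node)
            if not nbrs:                 # dangling node: jump back to source
                node = source
                continue
            node = rng.choice(nbrs)
        endpoints[node] += 1             # walk endpoint approximates PPR mass
    return {v: c / num_walks for v, c in endpoints.items()}

# toy graph: a directed 4-node cycle; most mass stays near the source
g = {0: [1], 1: [2], 2: [3], 3: [0]}
est = ppr_monte_carlo(g, source=0)
```

The estimates sum to 1 and decay with distance from the source, which is the locality that makes clustering among sources and targets helpful for joint estimation.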

AMIR: Analytic Method for Improving Responsiveness by Reducing Burstiness

Service demand burstiness, or serial correlation in resource service demands, has previously been shown to have an adverse impact on system performance metrics such as response time. This paper proposes AMIR, an Analytic Method for Improving Responsiveness by reducing burstiness. AMIR considers multi-queue systems that service multiple classes of sessions, i.e., sequences of requests. Given the per-class service demand distributions and the number of sessions belonging to each class, AMIR decides an ordering of sessions, i.e., a schedule, that minimizes burstiness at the bottleneck and hence improves system responsiveness metrics, including mean session wait time and total schedule processing time. A key aspect of the technique is an order-O schedule burstiness metric β_O, which represents the mean probability that O+1 consecutive sessions in the schedule have resource demands at the bottleneck greater than the mean bottleneck demand of the schedule. For a given O, AMIR uses Integer Linear Programming (ILP) to produce schedules that progressively minimize β_i for all i ∈ {1,...,O}. Extensive simulation results show that AMIR significantly outperforms baseline policies such as Shortest Job First (SJF) and random scheduling with respect to metrics such as mean session wait time and total processing time. Furthermore, it offers more predictable performance than the other policies. Our results also provide insights into the conditions under which AMIR is most effective. To the best of our knowledge, for the types of systems we consider, no other techniques are designed to improve performance by explicitly minimizing high-order burstiness.
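The order-O burstiness metric (call it β_O here) can be read empirically as a windowed count over the schedule. The sketch below is one plausible reading of that definition — the fraction of windows of O+1 consecutive sessions whose bottleneck demands all exceed the schedule mean — not the paper's exact estimator, and the demand values are made up for illustration.

```python
def burstiness(schedule, order):
    """Empirical order-`order` burstiness of a schedule of per-session
    bottleneck demands: the fraction of windows of `order` + 1 consecutive
    sessions whose demands all exceed the schedule's mean demand."""
    mean = sum(schedule) / len(schedule)
    w = order + 1
    windows = [schedule[i:i + w] for i in range(len(schedule) - w + 1)]
    hits = sum(all(d > mean for d in win) for win in windows)
    return hits / len(windows)

bursty = [9, 9, 9, 1, 1, 1]    # large demands clustered together
spread = [9, 1, 9, 1, 9, 1]    # same demands interleaved
b1_bursty = burstiness(bursty, order=1)   # 2 of 5 windows exceed the mean
b1_spread = burstiness(spread, order=1)   # no window of 2 exceeds the mean
```

Reordering the same multiset of demands changes the metric, which is exactly the degree of freedom the ILP exploits.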

A Framework for Allocating Server Time to Spot and On-demand Services in Cloud Computing

Cloud computing delivers value to users by facilitating their access to servers in periods when the need arises. One approach is to provide both on-demand and spot services on shared servers. On-demand users have high priority, access servers at a fixed price, and occupy server periods of varying length. Spot users bid for the remaining unoccupied periods; without appropriate design, however, such periods may be arbitrarily small due to the random arrivals and departures of on-demand users. This is also the current service model adopted by Amazon Elastic Compute Cloud (EC2). In this paper, we provide the first integrated framework for sharing server time between on-demand and spot services while optimally pricing the spot service. It guarantees that on-demand users are served quickly while spot users, once accepted, can stably utilize servers for a suitably long period, a key feature in making both on-demand and spot services accessible. Simulation results show that, by complementing the on-demand market with a spot market, a cloud provider can improve revenue by up to 464.7%. The framework is designed under assumptions that are met in real environments. It is a new tool that other cloud operators can use to quantify the advantage of a hybrid spot and on-demand service, eventually making the case for operating such a service model in their own infrastructures.
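To fix intuition for bid-based allocation of leftover capacity, here is a uniform-price auction over spare server slots: accept the highest bids up to capacity and charge every winner the highest losing bid. This is a standard textbook baseline, not the pricing mechanism the paper proposes (which additionally guarantees stable occupancy periods); the capacity and bid values are invented for illustration.

```python
def spot_auction(capacity, bids):
    """Uniform-price auction for `capacity` spare server slots: accept the
    top `capacity` bids; every winner pays the highest losing bid, or a
    floor of 0 if demand fits within capacity."""
    ranked = sorted(bids, reverse=True)
    winners = ranked[:capacity]
    clearing = ranked[capacity] if len(ranked) > capacity else 0
    revenue = clearing * len(winners)
    return winners, clearing, revenue

# 5 spot bids compete for 3 unoccupied slots
winners, price, revenue = spot_auction(capacity=3, bids=[9, 4, 7, 2, 5])
```

Because every winner pays the same market-clearing price, truthful bidding is a reasonable strategy, which is one reason this style of auction is a common point of comparison for spot markets.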

Consistent Sampling of Churn Under Periodic Non-Stationary Arrivals in Distributed Systems

Characterizing user churn has become an important research area in networks and distributed systems, both in theoretical analysis and in system design. A realistic churn model, often measured through periodic observation, should replicate two key properties of deployed systems: 1) the arrival process; and 2) the lifetime distribution of participating agents. Assuming a stationary arrival process, previous work shows that consistent (i.e., asymptotically accurate) estimation of the lifetime distribution is possible; however, the problem remains open in non-stationary cases. Questions include what distributions these methods sample when the assumptions on the arrival process are violated, under what conditions consistency is possible with existing techniques, and what avenues exist for improving their accuracy and overhead. To investigate these issues, we first use random-measure theory to develop a novel churn model that allows rather general non-stationary scenarios and even synchronized joins (e.g., flash crowds). We not only dispense with common assumptions, such as the existence of an arrival rate and ergodicity, but also show that this model can produce all metrics of interest (e.g., sampled lifetime distributions, bandwidth overhead) using simple expressions. We apply these results to study the accuracy of prior techniques and discover that they are biased unless user lifetimes are exponential or the arrival measure is stationary. To overcome these limitations, we then create a new lifetime-sampling technique that remains asymptotically robust under all periodic arrival measures, and we provide a methodology for undoing the bias in the sampled arrival rate created by missed users. We demonstrate that the proposed approach exhibits accuracy advantages and 1-2 orders of magnitude less bandwidth consumption compared to the alternatives. We finish by implementing the proposed framework and applying it to experimental data from massive crawls of Gnutella.
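The bias created by missed users can be seen in a toy periodic-crawl model: a session is observed only if some snapshot falls inside its lifetime, so short-lived users slip between snapshots and the observed lifetimes skew long. This is a synthetic sketch (exponential lifetimes, uniform arrivals, an arbitrary crawl period), not the paper's model or estimator.

```python
import math
import random

def sampled_lifetimes(sessions, delta):
    """Lifetimes of only those sessions a periodic crawler (period `delta`,
    snapshots at times k * delta) observes at least once. A session
    [arrival, arrival + life) is seen iff a snapshot lands inside it."""
    seen = []
    for arrival, life in sessions:
        first_snapshot = math.ceil(arrival / delta) * delta
        if first_snapshot < arrival + life:
            seen.append(life)
    return seen

rng = random.Random(1)
# 5000 sessions: uniform arrivals over [0, 1000), exponential(1) lifetimes
sessions = [(rng.uniform(0, 1000), rng.expovariate(1.0)) for _ in range(5000)]
true_mean = sum(life for _, life in sessions) / len(sessions)
obs = sampled_lifetimes(sessions, delta=2.0)
obs_mean = sum(obs) / len(obs)
# obs_mean overstates true_mean: sessions shorter than the gap to the next
# snapshot are missed entirely, biasing the sample toward long lifetimes
```

Undoing exactly this kind of bias in the sampled distributions is what the proposed technique addresses.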

Reviewer Acknowledgement
