Cloud storage services such as Dropbox, OneDrive, Google Drive, and iCloud Drive provide users with a convenient and reliable way to store and share data from anywhere, on any device, and at any time. The cornerstone of these services is the data synchronization sync operation which automatically maps the changes in users' local filesystems to the cloud via a series of network communications in a timely manner. On the other hand, without careful design and implementation, the data sync mechanisms could generate overwhelming data sync traffic, causing tremendous financial overhead and performance penalties to both service providers and end users. This paper addresses a simple yet critical question: Is the current data sync traffic of cloud storage services efficiently used? We first define a novel metric named TUE to quantify the Traffic Usage Efficiency of data synchronization. Then, by conducting comprehensive benchmark experiments and reverse engineering the data sync processes of eight widely used cloud storage services, we uncover their manifold practical endeavors for optimizing the TUE, including three intra-file approaches (i.e., compression, delta sync, and interrupted transfer resumption), two cross-file/-user approaches (i.e., deduplication and peer-assisted offloading), two batching approaches (i.e., file bundling and sync deferment), and two web-specific approaches (i.e., thumbnail views and dynamic content loading). Our measurement results reveal that a considerable portion of the data sync traffic is in a sense wasteful, and can be effectively avoided or significantly reduced via carefully designed data sync mechanisms. Most importantly, our study of TUE not only offers practical, actionable guidance for providers to build more efficient, traffic-economic services, but also helps end users pick appropriate services that best fit their use cases and budgets.
Parallel server frameworks are widely deployed in modern large-data processing applications. Intuitively, splitting and parallel processing of the workload provides accelerated application response times and scaling flexibility. Such frameworks include MapReduce, Hadoop, and Spark. For many applications, the dynamics of such systems are naturally captured by a Fork-Join (FJ) queuing model, where incoming jobs are split into tasks each of which is mapped to exactly one server. When all the tasks that belong to one job are executed, the job is reassembled and leaves the system. We consider this behavior at the output as a synchronization constraint. In this paper, we study the performance of such parallel systems for different server properties, i.e., work-conservingness, phase-type behavior, and as suggested by recent evidence, for bursty input job arrivals. We establish a large deviations principle for the steady-state job waiting times in an FJ system based on Markov-additive processes. Building on that, we present a performance analysis framework for FJ systems and provide computable bounds on the tail probabilities of the steady-state waiting times. We validate our bounds using estimates obtained through simulations. In addition, we define and analyze provisioning, a flexible division of jobs into tasks, in FJ systems. Finally, we use this framework together with real-world traces to show the benefits of an adaptive provisioning system that adjusts the service within an FJ system based on the estimates of the arrival intensity.
We consider the task of content replication in distributed content delivery systems used by Video-on-Demand (VoD) services with large content catalogs. The prior work in this area focuses on the setting where each request is generated independent of all past requests. Motivated by the fact that most popular VoD services offer recommendations to users based on their viewing history, in a departure from existing studies, we study the setting with time-correlation in requests coming from each user. We use a Markovian process to model each users request process. In addition to introducing time-correlation in user requests, our model is consistent with empirically observed properties of the request process for VoD services with recommendation engines. In the setting where the underlying Markov Chain is unknown and has to be learned from the very requests the system is trying to serve, we show that separating the task of estimating content popularity and using the estimates to design a static content replication policy is strictly sub-optimal. To prove this, we show that an adaptive policy, which jointly performs the task of estimation and content replication, outperforms all policies which separate the task of estimation and content replication.
Power market operators have recently introduced smart grid demand response (DR), in which electricity consumers regulate their power usage following market requirements. DR helps stabilize the grid and enables integrating a larger amount of intermittent renewable. Data centers provide unique opportunities for DR participation due to their flexibility in both workload servicing and power consumption. While prior studies have focused on data center participation in legacy DR programs such as dynamic energy pricing and peak shaving, this paper studies data centers in emerging DR programs, i.e., demand side capacity reserves. Among different types of capacity reserves, regulation service reserves (RSRs) are especially attractive due to their relatively higher value. This paper proposes EnergyQARE, the Energy and Quality-of-Service (QoS) Aware RSR Enabler, which enables data center RSR provision in real-life scenarios. EnergyQARE not only incorporates RSR market constraints, but also adaptively makes decisions based on workload QoS feedback and provides QoS guarantees. It modulates data center power through server power management techniques and server provisioning, handles a heterogeneous set of jobs, and considers transition time delay and energy loss of servers in order to reflect real-life scenarios. In addition, EnergyQARE searches for a near-optimal energy and reserve bidding market strategy in RSR provision. Simulated numerical results demonstrate that in a general data center scenario, EnergyQARE provides close to 50% of data center average power consumption as reserves to the market, and saves up to 44% in data center electricity cost, while still meeting workload QoS constraints. Case studies in this paper show that the percentages of savings are not sensitive to workload types or the size of the data center, although they depend strongly on data center utilization and parameters of server power states.
Stochastic models, based on M/G/1 queues and matrix analytic methods, are developed to compute the probability distribution of response times (i.e. data access times) in distributed storage systems protected by erasure coding. Using these models, different levels of redundancy are compared in terms of the reduction in mean, higher moments and quantiles of response time. Not only does more redundancy increase data protection but it also significantly flattens the tail of response times; this is demonstrated numerically at medium-high quantiles, up to the 99th. Tests against simulation in a broad range of setups support the accuracy of these models. Numerical results quantify the reduction in response time achieved by using two variants of canceling, which limits the additional load introduced by the redundancy.
Routing policies play a major role in the performance of communication networks. Backpressure-based adaptive routing algorithms where traffic is load-balanced along different routing paths on a per-packet basis have been studied extensively in the literature. Although backpressure-based algorithms have been shown to be network-wide throughput optimal, they typically have poor delay performance under light or moderate loads because packets may be sent over unnecessarily long routes. Further, backpressure-based algorithms have required every node to compute differential backlogs for every destination queue with the corresponding destination queue at every adjacent node, which is expensive given the large number of possible pairwise differential backlogs between neighbor nodes. In this paper, we propose new backpressure-based adaptive routing algorithms that only use shortest-path routes to destinations when they are sufficient to accommodate the given traffic load, but the proposed algorithms will incrementally expand routing choices as needed to accommodate increasing traffic loads. We show analytically by means of fluid analysis that the proposed algorithms retain network-wide throughput optimality, and we show empirically by means of simulations that our proposed algorithms provide substantial improvements in delay performance. Our evaluations further show that in practice, our approach dramatically reduces the number of pairwise differential backlogs that have to be computed and the amount of corresponding backlog information that has to be exchanged because routing choices are only incrementally expanded as needed.