Power cost minimization in data centers via Lyapunov optimization①

2016-12-29 05:34, Zhang Ran, Wan Jianxiong
High Technology Letters, 2016, Issue 4

Zhang Ran (张 然)②*, Wan Jianxiong**

(*School of Mathematics and Statistics of Central South University, Changsha 410083, P.R.China) (**School of Information Engineering, Inner Mongolia University of Technology, Hohhot 010051, P.R.China)




With the advent of big data, the demand for computing has grown on a very large scale over the past decade, and geographically distributed data centers have been built as cloud computing has developed. This paper considers a Lyapunov optimization approach to the problem of minimizing the energy cost of distributed Internet data centers (IDCs). By capturing the power cost of both servers and cooling systems, the Lyapunov optimization technique is used to design a decision strategy that offers provable power cost minimization and QoS guarantees. The performance and effectiveness of the algorithm are validated via simulations driven by real-world traces.

power cost, quality-of-service (QoS), workload

0 Introduction

In recent years, Internet services and cloud computing have become part of people's lives, so that big data and computation are migrated to or hosted in Internet data centers (IDCs). A huge IDC can have a total construction area of 300,000 square meters and host more than one million high-performance servers. However, the energy consumed in IDCs and its cost have gradually gotten out of control. Qureshi et al. found that many IDC operators spent more than 10 million dollars on their annual electricity bills[1], so research has focused on how to reduce the energy consumption and electricity cost of IDCs.

Most previous work on power management paid close attention to reducing the total energy consumption. However, apart from energy consumption, the electricity price also deserves attention, since electricity prices in western countries exhibit time and location diversity[2-5]. Although the above studies examined the electricity cost of servers and practical applications, they ignored another aspect of the energy cost, the cooling system. Zhang et al. designed and evaluated TEStore, which exploits thermal and energy storage techniques to cut the electricity bill for data center cooling without causing servers in a data center to overheat[6]. Nevertheless, that study reduced the total cost only in a coarse-grained manner and ignored the wide differences in server temperature across diverse server racks and applications.

This paper formulates an expense minimization of IDC's electricity power (EMIEP) problem to minimize the time-averaged expected energy cost subject to QoS and average temperature constraints. It then designs an algorithm leveraging the Lyapunov optimization technique to approximately solve the EMIEP problem, and uses a real workload trace from Ordos UniCloud Technology Co., Ltd. to simulate the algorithm. Numerical results illustrate that the presented algorithm can reduce the total energy cost while guaranteeing the QoS and temperature constraints.

1 System model

This section presents the system model shown in Fig.1 and formulates the EMIEP problem. An IDC physically consists of racks of servers and is logically made up of a number of applications, each of which schedules its servers to process arriving service requests. All requests entering an application share the same workload queue, and a request in the buffer is scheduled to a currently idle server. A quality-of-service (QoS) requirement must be met for each application.

Running servers inevitably generate massive waste heat. For the sake of server reliability, a computer room air conditioner (CRAC) regulates the server temperatures by drawing cold air into the server racks, pushing waste heat out of the machine room, and recycling the inside air via the air chilling unit. Fig.1 shows the air flow of the system. The light arrow represents cold air and the dark one represents hot air exhausted from the racks. Apart from the CRAC, an indoor air conditioner maintains the machine room temperature at a predetermined set point $T_{sp}$.

Fig.1 Layout of IDC Applications

The whole industry is confronted with huge cost of energy consumption, including server energy consumption and cooling energy consumption. This paper formulates the problem of minimizing the total energy cost of the data center subject to QoS and server temperature constraint. The formulated model can be simply described as follows:

min server power cost + cooling power cost

subject to:

QoS constraint, for each application

Average temperature constraint, for each server

The problem involves three quantities: the number of servers for each application and the cold air temperature, which are the decision variables, and the electricity price, which is observed in each slot. They are discussed in detail in the following sections.

2 Problem formulation

2.1 Energy cost model

2.1.1 The server side

Let $J$ be the total number of applications hosted in the IDC. At time slot $t$, define $p(t)$, $e_j(t)$, $L_j(t)$, and $m_j(t)$ as the electricity price, the energy consumption of a single server, the workload, and the number of servers for application $j\in\{1,\dots,J\}$, respectively. Refs[7,8] presented a linear function to describe the relationship between power consumption and server load:

$$e_j(t)=a_1\frac{L_j(t)}{m_j(t)}+a_2 \tag{1}$$

where $a_1$ is the marginal energy consumption of the CPU and $a_2$ denotes the server energy consumption excluding the CPU. The total energy consumption of application $j$ is $E_j(t)=m_j(t)\times e_j(t)=a_1L_j(t)+a_2m_j(t)$, and the total server energy cost in the IDC is

$$P_S(t)=p(t)\sum_{j=1}^{J}E_j(t)=p(t)\sum_{j=1}^{J}\bigl(a_1L_j(t)+a_2m_j(t)\bigr) \tag{2}$$
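The server-side model can be sketched in a few lines of Python. This is a minimal illustration, assuming the linear per-server model $e_j(t)=a_1L_j(t)/m_j(t)+a_2$ implied by the expression for $E_j(t)$ above. The function names, the unit price and the two-application workload are hypothetical and chosen only for the example; the defaults $a_1=a_2=40$ are the values used later in Section 5.1.

```python
def server_power(load, servers, a1=40.0, a2=40.0):
    """Per-server power e_j(t) = a1 * L_j(t) / m_j(t) + a2."""
    return a1 * load / servers + a2


def server_energy_cost(price, loads, servers, a1=40.0, a2=40.0):
    """Server energy cost p(t) * sum_j (a1 * L_j(t) + a2 * m_j(t))."""
    return price * sum(a1 * L + a2 * m for L, m in zip(loads, servers))


# Hypothetical slot: two applications at unit electricity price.
loads = [3.0, 5.0]   # L_j(t), normalized request rates
servers = [4, 8]     # m_j(t)
print(server_power(loads[0], servers[0]))        # 70.0
print(server_energy_cost(1.0, loads, servers))   # 800.0
```

Note that the cost depends on $m_j(t)$ only through the term $a_2m_j(t)$, which is why switching on fewer servers lowers the server-side bill for a given workload.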

2.1.2 The air chiller side

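With $c$ the heat capacity of air, $f$ the air flow rate, $\rho$ the air density, and $T_c(t)$ the cold air temperature, the cooling energy consumption is

$$C(t)=cf\rho\,(T_{sp}-T_c(t)) \tag{3}$$

The energy cost is

$$P_C(t)=p(t)\times cf\rho\,(T_{sp}-T_c(t)) \tag{4}$$

The total power cost of the data center is the sum of $P_S(t)$ and $P_C(t)$.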
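A small sketch of the cooling side follows, under the assumption that $c$, $f$ and $\rho$ are the heat capacity, flow rate and density of air with the values listed in Section 5.1. The function names and the sample temperatures are illustrative only.

```python
def cooling_power(t_c, t_sp=25.0, c=1005.0, f=5.0, rho=1.205):
    """Cooling power C(t) = c * f * rho * (T_sp - T_c(t)), Eq.(3)."""
    return c * f * rho * (t_sp - t_c)


def cooling_cost(price, t_c, **air):
    """Cooling energy cost p(t) * C(t), Eq.(4)."""
    return price * cooling_power(t_c, **air)


# Hypothetical comparison of two cold air temperatures at unit price:
# the colder setting is the more expensive one.
for t_c in (20.0, 19.0):
    print(t_c, cooling_cost(1.0, t_c))
```

The cooling cost falls linearly as $T_c(t)$ rises toward $T_{sp}$, so cost alone always favors warmer inlet air; it is the temperature constraint that pushes $T_c(t)$ down.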

2.2 Constraints

2.2.1 QoS constraint

(5)

2.2.2 Temperature constraint

At steady state, the temperature $T_j(t)$ of server $j$ is determined by the inlet cold air temperature $T_c(t)$ and the energy consumption $e_j(t)$:

$$T_j(t)=T_c(t)+\zeta e_j(t) \tag{6}$$

where $\zeta$ is the heat exchange rate expressed in K·s/J. In a production data center, the reliability of the processor and motherboard of a server is mainly influenced by temperature gradient and thermal stress. For reliability, the expected server temperature must be kept below a certain threshold $T_{\max}$. Plugging Eq.(1) into Eq.(6) yields:

$$\mathbb{E}\Bigl\{T_c(t)+\zeta\Bigl(a_1\frac{L_j(t)}{m_j(t)}+a_2\Bigr)\Bigr\}\le T_{\max} \tag{7}$$
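As a quick numerical check, the sketch below evaluates the server temperature under the assumption that the steady-state relation is $T_c(t)+\zeta e_j(t)$, which matches the terms of the virtual queue update in Eq.(13). Parameter values are those of Section 5.1; the function names are hypothetical.

```python
def server_temperature(t_c, load, servers, zeta=0.625, a1=40.0, a2=40.0):
    """Assumed steady-state temperature: T_c + zeta * (a1 * L / m + a2)."""
    return t_c + zeta * (a1 * load / servers + a2)


def highest_safe_cold_air(load, servers, t_max=60.0, zeta=0.625,
                          a1=40.0, a2=40.0):
    """Largest T_c that keeps the server at or below t_max in one slot."""
    return t_max - zeta * (a1 * load / servers + a2)


# Fully loaded servers (L/m = 1) draw 80 W and sit 50 K above the inlet air.
print(server_temperature(10.0, load=4.0, servers=4))   # 60.0
# Idle servers draw 40 W, so the inlet air may be as warm as 35 degrees.
print(highest_safe_cold_air(load=0.0, servers=4))      # 35.0
```

Spreading the same workload over more servers lowers the per-server power and therefore the temperature, which is the lever that competes with the server energy cost.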

2.3 Problem formulation

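After some rearrangement of Eq.(5) and inequality (7), the EMIEP problem is defined as follows:

$$\min\ \limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}\bigl\{p(t)\,(E(t)+C(t))\bigr\} \tag{8}$$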

subject to:

(9)

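$$\mathbb{E}\Bigl\{T_c(t)+\zeta\Bigl(a_1\frac{L_j(t)}{m_j(t)}+a_2\Bigr)\Bigr\}-T_{\max}\le 0,\quad\forall j,t \tag{10}$$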

Tmin≤Tc≤Tmax

The EMIEP problem cannot be easily solved for the following reasons: 1) the unknown probability distributions of $L_j(t)$ and $p(t)$ make the expectations in Eq.(8) and constraint (10) difficult to compute; 2) traditional methods for dynamic optimization problems, such as dynamic programming, suffer from the curse of dimensionality, i.e., the computational complexity grows exponentially with the problem size. Therefore, an alternative method is needed to solve this problem approximately.

3 A Lyapunov approach to solve EMIEP problem

This section first relaxes constraint (10) and then uses Lyapunov optimization theory to obtain an approximately optimal solution.

3.1 Relaxing the EMIEP problem

Constraint (10) can be relaxed into

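$$\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}\Bigl\{T_c(t)+\zeta\Bigl(a_1\frac{L_j(t)}{m_j(t)}+a_2\Bigr)\Bigr\}-T_{\max}\le 0,\quad\forall j \tag{11}$$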

The relaxation means that the expected server temperature may occasionally exceed the temperature bound without harming reliability, as long as the time-averaged expected temperature stays within the acceptable range.

Replacing constraint (10) in the original problem by constraint (11) leads to a relaxed version of EMIEP. Constraint (11) can be further transformed into:

$$\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}\bigl\{m_j(t)T_c(t)+\zeta\,(a_1L_j(t)+a_2m_j(t))-T_{\max}m_j(t)\bigr\}\le 0 \tag{12}$$

To satisfy constraint (12), EMIEP can be transformed into a queue stability problem. Define a virtual queue $Z_j(t)$ for each application $j$ with the update

$$Z_j(t+1)=\max\bigl\{Z_j(t)+m_j(t)T_c(t)+\zeta\,(a_1L_j(t)+a_2m_j(t))-T_{\max}m_j(t),\,0\bigr\} \tag{13}$$
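The virtual queue is easy to simulate. The sketch below implements the update of Eq.(13) for a single application; the function name and the sample trajectory are hypothetical, and the default parameters are the values from Section 5.1.

```python
def update_virtual_queue(z, servers, t_c, load,
                         t_max=60.0, zeta=0.625, a1=40.0, a2=40.0):
    """One step of Eq.(13) for a single application."""
    excess = (servers * t_c + zeta * (a1 * load + a2 * servers)
              - t_max * servers)
    return max(z + excess, 0.0)


# Hypothetical trajectory: 4 fully loaded servers, one hot slot, then cool slots.
z = 0.0
for t_c in (20.0, 5.0, 5.0, 5.0):
    z = update_virtual_queue(z, servers=4, t_c=t_c, load=4.0)
    print(t_c, z)   # z takes the values 40.0, 20.0, 0.0, 0.0
```

The backlog grows in slots where the servers run hotter than $T_{\max}$ and drains in slots where they run cooler, so a bounded backlog corresponds to the time-averaged constraint (12) being met.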

3.2 The objective function

The Lyapunov optimization framework designs a control algorithm that chooses actions in every slot $t$ so that the time-averaged expectation of the objective function is close to the optimal value while the virtual queues $Z_j(t)$ remain mean rate stable. The algorithm thus turns the original problem into an alternative one: minimizing the time average of a cost function subject to queue stability.

Let $Z(t)$ be the concatenated vector of all virtual queues $Z_j(t)$ with update Eq.(13). Define the Lyapunov function:

$$L(Z(t))=\frac{1}{2}\sum_{j=1}^{J}Z_j(t)^2 \tag{14}$$

Define $\Delta(Z(t))$ as the conditional Lyapunov drift for slot $t$:

$$\Delta(Z(t))=\mathbb{E}\{L(Z(t+1))-L(Z(t))\mid Z(t)\} \tag{15}$$

where the expectation depends on the control policy and random workload arrivals.

Instead of taking control actions to directly minimize Eq.(8), Lyapunov optimization seeks to minimize a bound on the following drift-plus-penalty function

$$\Delta(Z(t))+V\,\mathbb{E}\{p(t)(E(t)+C(t))\mid Z(t)\} \tag{16}$$

where $V\ge 0$ is an "importance weight" that determines how much the algorithm emphasizes cost minimization. Section 4 shows that an algorithm minimizing Eq.(16) achieves a close-to-optimal solution while stabilizing $Z(t)$.

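The drift-plus-penalty function Eq.(16) can be bounded as follows:

$$\begin{aligned}&\Delta(Z(t))+V\,\mathbb{E}\{p(t)(E(t)+C(t))\mid Z(t)\}\\&\quad\le B+\sum_{j=1}^{J}Z_j(t)\,\mathbb{E}\{m_j(t)T_c(t)+\zeta\,(a_1L_j(t)+a_2m_j(t))-T_{\max}m_j(t)\mid Z(t)\}\\&\qquad+V\,\mathbb{E}\{p(t)(E(t)+C(t))\mid Z(t)\}\end{aligned} \tag{17}$$

where $B$ is a constant defined as $B=m_{\max}T_{\max}+\zeta(a_1L_{\max}+a_2m_{\max})$. The designed algorithm minimizes the right-hand side of Eq.(17).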
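For readers who want the intermediate step, the bound can be sketched as follows. The shorthand $y_j(t)$ is introduced here only for this derivation, the Lyapunov function is assumed to be the usual quadratic $\frac{1}{2}\sum_j Z_j(t)^2$, and the argument is the standard one from Lyapunov optimization rather than a reproduction of the authors' proof. Let

$$y_j(t)=m_j(t)T_c(t)+\zeta\,(a_1L_j(t)+a_2m_j(t))-T_{\max}m_j(t)$$

Because $\max\{x,0\}^2\le x^2$, the update Eq.(13) gives

$$Z_j(t+1)^2\le\bigl(Z_j(t)+y_j(t)\bigr)^2=Z_j(t)^2+2Z_j(t)y_j(t)+y_j(t)^2$$

Summing over $j$ and dividing by 2,

$$\frac{1}{2}\sum_{j=1}^{J}\bigl(Z_j(t+1)^2-Z_j(t)^2\bigr)\le\frac{1}{2}\sum_{j=1}^{J}y_j(t)^2+\sum_{j=1}^{J}Z_j(t)y_j(t)$$

Taking the conditional expectation given $Z(t)$ and bounding the first term on the right by a finite constant, which is possible when $m_j(t)$, $L_j(t)$ and $T_c(t)$ are bounded, gives a drift bound of the form "constant plus $\sum_j Z_j(t)\,\mathbb{E}\{y_j(t)\mid Z(t)\}$". Adding $V\,\mathbb{E}\{p(t)(E(t)+C(t))\mid Z(t)\}$ to both sides yields a bound of the form of Eq.(17).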

3.3 Algorithm design

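For the observed $Z(t)$, $L_j(t)$ and $p(t)$, the objective to be minimized in each slot, i.e., the right-hand side of Eq.(17) without the constant $B$, can be written as

$$\sum_{j=1}^{J}Z_j(t)\bigl[m_j(t)T_c+\zeta\,(a_1L_j(t)+a_2m_j(t))-T_{\max}m_j(t)\bigr]+Vp(t)\Bigl[\sum_{j=1}^{J}\bigl(a_1L_j(t)+a_2m_j(t)\bigr)+cf\rho\,(T_{sp}-T_c)\Bigr] \tag{18}$$

The cross term $m_j(t)T_c$ of the control variables complicates the problem. However, since $L_j(t)$ and $p(t)$ are known in slot $t$, fixing $T_c$ makes the objective linear in $m_j(t)$. To see this, dropping the terms of Eq.(18) that do not depend on the control variables yields

$$\sum_{j=1}^{J}\bigl\{Z_j(t)(T_c+\zeta a_2-T_{\max})m_j(t)+Vp(t)a_2m_j(t)\bigr\}-Vp(t)cf\rho T_c \tag{19}$$

Rearranging (19) yields

$$\sum_{j=1}^{J}\bigl[Z_j(t)(T_c+\zeta a_2-T_{\max})+Vp(t)a_2\bigr]m_j(t)-Vp(t)cf\rho T_c \tag{20}$$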

Algorithm 1 EMIEP: Choosing the best m_j^best(t) for the drift-plus-penalty algorithm
1: Determine the upper bound of m_j(t) as m_j^max
2: Calculate the minimum of m_j(t): m_j^min(t) = (1/D_j + L_j(t)) / μ
3: Study the coefficient of m_j(t):
4: if Z_j(t)(T_c + ζa_2 - T_max) + Vp(t)a_2 > 0 then
5:     return m_j^min(t)
6: else
7:     return m_j^max
8: end if
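Algorithm 1 can be sketched in Python as follows. Two points are assumptions rather than statements from the paper: step 2 is read as $m_j^{\min}(t)=(1/D_j+L_j(t))/\mu$, rounded up to an integer, and the result is capped at the upper bound. The function names and the sample arguments (including the delay bound of 0.5 in normalized time units) are hypothetical.

```python
import math


def min_servers(load, delay_bound, mu=1.0):
    """Assumed reading of step 2: m_min = (1/D_j + L_j(t)) / mu, rounded up."""
    return math.ceil((1.0 / delay_bound + load) / mu)


def choose_best_m(z, price, t_c, load, m_max, delay_bound, V,
                  t_max=60.0, zeta=0.625, a2=40.0, mu=1.0):
    """Algorithm 1: the objective is linear in m_j, so an endpoint is optimal."""
    # Capping at m_max is an added safeguard, not part of the listing.
    m_min = min(min_servers(load, delay_bound, mu), m_max)
    coefficient = z * (t_c + zeta * a2 - t_max) + V * price * a2
    return m_min if coefficient > 0 else m_max


# Empty queue: cost dominates, so the fewest servers meeting QoS are used.
print(choose_best_m(z=0.0, price=1.0, t_c=10.0, load=3.0,
                    m_max=10, delay_bound=0.5, V=10.0))    # 5
# Large backlog: spreading the load over all servers cools them down.
print(choose_best_m(z=100.0, price=1.0, t_c=10.0, load=3.0,
                    m_max=10, delay_bound=0.5, V=10.0))    # 10
```

The sign test works because a linear function on an interval attains its minimum at one of the two endpoints: a positive coefficient favors the smallest feasible $m_j(t)$, a non-positive one the largest.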

The next step is to test all possible cooling air temperatures to find the optimal $T_c$. This can be done via Algorithm 2.

Algorithm 2 EMIEP: Minimizing the drift-plus-penalty
1: Define:
     interval: set of decision epochs
     temperature: set of possible cooling air temperatures
     applications: set of applications
     Fobj: the value of the objective function
     DPP[T_c]: the value of the Lyapunov drift-plus-penalty at T_c
2: for all t ∈ interval do
3:     for all T_c ∈ temperature do
4:         Call Alg.1, acquire m_j^best(T_c) for all j
5:         Fobj[T_c] ← Σ_{j=1..J} p(t)(a_1 L_j(t) + a_2 m_j^best(T_c)) + p(t)·c·f·ρ(T_sp - T_c)
6:         DPP[T_c] ← Σ_{j=1..J} Z_j(t){m_j^best(T_c) T_c + ζ(a_1 L_j(t) + a_2 m_j^best(T_c)) - T_max m_j^best(T_c)} + V × Fobj[T_c]
7:     end for
8:     minTc ← argmin_{T_c} DPP[T_c]
9:     m^best(t) ← m[minTc]
10:    minFobj(t) ← Fobj[minTc]
11:    minServerprice(t) ← serverprice[minTc]
12:    minCoolingprice(t) ← coolingprice[minTc]
13:    Z_j(t+1) ← max{Z_j(t) + m_j^best(t) T_c(t) + ζ(a_1 L_j(t) + a_2 m_j^best(t)) - T_max m_j^best(t), 0}
14: end for
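One slot of Algorithm 2 might look like the following in Python. This is a sketch under stated assumptions, not the authors' implementation: the QoS lower bound on $m_j(t)$ is read as $(1/D_j+L_j(t))/\mu$ rounded up, the temperature set is a grid of integer degrees, and the names `best_m`, `emiep_step` and all sample inputs are hypothetical. The constants are the values from Section 5.1.

```python
import math

A1, A2 = 40.0, 40.0                  # coefficients of Eq.(1)
ZETA, T_MAX, T_SP = 0.625, 60.0, 25.0
C_F_RHO = 1005.0 * 5.0 * 1.205       # c * f * rho of Eq.(3)


def best_m(z, price, t_c, load, m_max, delay, V, mu=1.0):
    """Algorithm 1: pick an endpoint of [m_min, m_max] by the sign test."""
    m_min = min(m_max, math.ceil((1.0 / delay + load) / mu))  # assumed QoS bound
    coeff = z * (t_c + ZETA * A2 - T_MAX) + V * price * A2
    return m_min if coeff > 0 else m_max


def emiep_step(Z, price, loads, m_max, delays, V, temperatures):
    """One slot of Algorithm 2: scan T_c, minimize DPP, update the queues."""
    best = None
    for t_c in temperatures:
        m = [best_m(z, price, t_c, L, mm, d, V)
             for z, L, mm, d in zip(Z, loads, m_max, delays)]
        f_obj = price * (sum(A1 * L + A2 * mj for L, mj in zip(loads, m))
                         + C_F_RHO * (T_SP - t_c))
        dpp = V * f_obj + sum(
            z * (mj * t_c + ZETA * (A1 * L + A2 * mj) - T_MAX * mj)
            for z, L, mj in zip(Z, loads, m))
        if best is None or dpp < best[0]:
            best = (dpp, t_c, m, f_obj)
    _, t_c, m, f_obj = best
    # Virtual queue update, Eq.(13), with the chosen controls.
    Z_next = [max(z + mj * t_c + ZETA * (A1 * L + A2 * mj) - T_MAX * mj, 0.0)
              for z, L, mj in zip(Z, loads, m)]
    return t_c, m, f_obj, Z_next


# Hypothetical first slot with empty queues: the cheapest setting is chosen.
print(emiep_step([0.0, 0.0], price=1.0, loads=[3.0, 5.0], m_max=[10, 10],
                 delays=[0.5, 0.5], V=10.0, temperatures=range(5, 26)))
# (25, [5, 7], 800.0, [25.0, 55.0])
```

With empty queues the step behaves like a pure cost minimizer (warmest air, fewest servers), and the queues then grow; once the backlog is large enough, the same step switches to colder air and more servers. The exhaustive scan over $T_c$ costs one pass of Algorithm 1 per candidate temperature.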

4 Performance analysis

This section first shows that the minimum time-averaged IDC energy cost can be achieved by a randomized stationary control policy that is independent of the virtual queue state $Z(t)$. Then, a performance bound for the objective function is derived.

Define $\mathbb{E}\{p(t)(E^{\pi}(t)+C^{\pi}(t))\}$ as the expected energy cost in the interval $[1,\dots,T]$ under control policy $\pi$, and $e^*$ as the minimum achievable $\mathbb{E}\{p(t)(E^{\pi}(t)+C^{\pi}(t))\}$ over all possible $\pi$. If problem Eq.(8) is feasible, then for any $\delta>0$ there is a policy $\pi^*$, depending only on the workload and the power price, that satisfies

$$\mathbb{E}\{p(t)(E^{\pi^*}(t)+C^{\pi^*}(t))\}\le e^*+\delta$$

This result is a direct application of Theorem 4.5 in Ref.[9], which shows that there is a queue-independent randomized stationary policy yielding an energy cost arbitrarily close to the optimum as long as the problem is feasible. Next, the performance bound is given.

Theorem 1 Suppose that $\mathbb{E}\{L(Z(0))\}<\infty$. Then the following results hold:

1) The achievable total energy cost of the EMIEP algorithm can be bounded by

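$$\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}\{p(t)(E(t)+C(t))\}\le e^*+\frac{B}{V} \tag{21}$$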

2) The virtual queue $Z(t)$ is mean rate stable, i.e.,

$$\lim_{T\to\infty}\frac{\mathbb{E}\{|Z_j(T)|\}}{T}=0,\quad\forall j \tag{22}$$

According to Theorem 1, the algorithm exhibits an $[O(1/V),O(V)]$ tradeoff, i.e., the energy cost can be pushed arbitrarily close to the optimum as $V\to\infty$, at the expense of virtual queue backlogs that grow as $O(V)$.

5 Numerical evaluations

This section conducts extensive simulations to evaluate the proposed EMIEP algorithm.

5.1 System configuration

All data used in this research were collected from four applications of Ordos Uni-Cloud Co., Ltd. during a week the authors spent at its Internet data center (IDC). Because the electricity cost in EMIEP comes from two sources, server power expenditure and the cooling system, the workload trace that drives the server power consists of the hourly mean request arrival rate of an interactive web service, shown in Fig.2.

To analyze the energy cost of the applications, the machine room set point $T_{sp}$ is set to 25℃. Following Ref.[9], the heat capacity and the density of air are taken as $c=1005$ J/(kg·K) and $\rho=1.205$ kg/m³ at 25℃. Without loss of generality, the service rate of a single server $\mu$ is normalized to 1 across all applications. If the minimal and maximal power consumption of a single server are 40W and 80W, the parameters in Eq.(1) can be set to $a_1=40$ and $a_2=40$. The heat exchange rate is assumed to be $\zeta=0.625$ K·s/J, the air flow rate is $f=5$ m³/s, and the maximum allowable server temperature is $T_{\max}=60$℃.
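The parameter choices can be checked by hand. The short calculation below is a worked illustration using only the values stated above, under the assumption that Eq.(1) has the form $e_j(t)=a_1L_j(t)/m_j(t)+a_2$. An idle server ($L_j(t)=0$) draws $a_2$, and a fully loaded one ($L_j(t)/m_j(t)=\mu=1$) draws $a_1+a_2$, so

$$a_2=40\ \text{W},\qquad a_1=80\ \text{W}-40\ \text{W}=40\ \text{W}$$

The coefficient of the cooling model in Eq.(3) is

$$cf\rho=1005\ \frac{\text{J}}{\text{kg}\cdot\text{K}}\times 5\ \frac{\text{m}^3}{\text{s}}\times 1.205\ \frac{\text{kg}}{\text{m}^3}=6055.125\ \frac{\text{W}}{\text{K}}$$

so each degree by which $T_c$ is lowered below $T_{sp}$ adds about 6.06 kW of cooling power.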

Fig.2 Workload trace involved in server power

5.2 Result analysis

Based on the above set parameters, the performance of the EMIEP algorithm is investigated and compared with a greedy policy, which is obtained by solving the following problem:

$$\min\ P_S(t)+P_C(t) \tag{23}$$

subject to:

Note that problem (23) greedily minimizes power cost in current slot rather than the long term average power cost.

5.2.1 Total energy cost and delay

Fig.3 shows that the drift-plus-penalty (DPP) value is also a random variable. In the initial slots, the number of samples is too small, so the expectation of the objective function is over-estimated; as time goes by, more samples are obtained, which leads to a more accurate estimate.

Fig.3 Value of drift-plus-penalty objective function with V=10

The vertical axes of Fig.4(a), 4(b) and 4(c) denote the total, server and cooling energy costs, respectively. The horizontal axis is $V$, which controls the relative weight of the Lyapunov drift and the objective function. The performance of the greedy strategy is also plotted in the figure. As shown in Fig.4(a) and 4(b), the total and server energy costs descend in steps as $V$ increases, so the objective function is optimized effectively. Although the cooling energy cost increases mildly at $V=10^8$, $V=10^9$ and $V=10^{10}$, as shown in Fig.4(c), this does not affect the general downward tendency of the energy cost as $V$ grows. When $V$ grows to $10^{11}$, EMIEP outperforms the greedy strategy.

(a) Total energy cost vs. V

(b) Server energy cost vs. V

(c) Cooling energy cost vs. V

The QoS (delay) of each application in Fig.6 shows that the 10ms QoS requirement is met by the EMIEP strategy. Fig.5 and Fig.3 show the number of servers allocated to each application over time and the variation of the drift-plus-penalty objective function, respectively. These results show that EMIEP can effectively minimize the system power cost while ensuring the QoS requirement.

Fig.5 Servers allocated to each application with V=10

Fig.6 QoS (delay)

5.2.2 Cooling temperature and server temperature

The cooling air temperature in Fig.7 follows the opposite trend to the cooling cost in Fig.4(c). This reflects the fact that the lower the cooling air temperature is set, the higher the cooling power consumption and cost.

Fig.7 Cooling air temperature vs. V

Since energy consumption directly influences server temperature, the server temperatures of the four applications shown in Fig.8 rise steadily with $V$, with occasional fluctuations at $V=10^9$ and $V=10^{10}$ for applications 3 and 4. Raising $V$ to $10^{11}$ slightly increases the average temperature, but yields significant power cost savings.

Fig.8 Server temperature vs. V

6 Conclusion

This work concentrates on greening the data center by minimizing time-averaged expected energy cost subject to QoS and average temperature constraints.

Since the workload distribution cannot be obtained in advance, it is challenging to design a dynamic control algorithm to achieve this goal.

To address this challenge, this work leverages the Lyapunov optimization technique to develop an algorithm that approximately solves the expense minimization of IDC's electricity power (EMIEP) problem. The energy cost achieved by the algorithm can be pushed arbitrarily close to the optimal solution as the control parameter $V$ is raised. Simulations based on a real workload trace from Ordos Uni-Cloud Technology Co., Ltd. illustrate that the algorithm can reduce the total energy cost in practice while guaranteeing the QoS and temperature constraints.

[1] Qureshi A, Weber R, Balakrishnan H, et al. Cutting the electric bill for internet-scale systems. ACM SIGCOMM Computer Communication Review, 2009, 1:123-134

[2] Dou H, Qi Y, Wang P J, et al. Hybrid power control and electricity cost management for distributed internet data centers in cloud computing. In: Proceedings of the 10th International Conference on Web Information System and Application, Yangzhou, China, 2013. 394-399

[3] Yao J G, Liu X, He W B, et al. Dynamic control of electricity cost with power demand smoothing and peak shaving for distributed internet data centers. In: Proceedings of the 2012 IEEE 32nd International Conference on Distributed Computing Systems, Macau, China, 2012, 67: 416-424

[4] Wang C, Urgaonkar B, Wang Q, et al. A hierarchical demand response framework for data center power cost optimization under real-world electricity pricing. In: Proceedings of the 2014 IEEE 22nd International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, Paris, France, 2014, 45: 305-314

[5] Xin Z, Reda S. Power budgeting techniques for data centers. IEEE Transactions on Computers, 2015, 64(8):2267-2278

[6] Zhang Y W, Wang Y, Wang X. TEStore: exploiting thermal and energy storage to cut the electricity bill for datacenter cooling. In: Proceedings of the 8th International Conference on Network and Service Management International Federation for Information Processing, Las Vegas, USA, 2012, 1:19-27

[7] Kaushik R T, Nahrstedt K. A data-centric cooling energy costs reduction approach for big data analytics cloud. In: Proceedings of the 2012 International Conference on High Performance Computing, Networking, Storage and Analysis, Madrid, Spain, 2012, 1: 1-11

[8] Li S, Le H, Pham N, et al. Joint optimization of computing and cooling energy: analytic model and a machine room case study. In: Proceedings of the International Conference on Distributed Computing Systems, Macau, China, 2012, 1: 396-405

[9] Neely M J. Stochastic network optimization with application to communication and queueing systems. Morgan & Claypool, 2010, 4:53-62

Zhang Ran is a Ph.D candidate in the School of Mathematics and Statistics of Central South University. He received his B.S. degree in applied mathematics from the Branch Campus of Peking University in 1994 and his M.S. degree in computer science from Xidian University in 2007. His research interests include probability and applied statistics, machine learning, IDC resource management, and data mining.

10.3772/j.issn.1006-6748.2016.04.003

① Supported by the National Natural Science Foundation of China (No. 61502255), the Inner Mongolia Provincial Natural Science Foundation (No. 2014BS0607), and the Science Research Project for Inner Mongolia College (No. NJZY14064).

② To whom correspondence should be addressed. E-mail: seran_zhang@126.com Received on Oct. 9, 2015
