Hybrid Two-Phase Task Allocation for Mobile Crowd Sensing

2022-03-12 05:56LIUJiahaoJINHanxinQIANGLeiGAOGuojuDUYangHUANGHe

计算机工程 2022年3期

LIU Jiahao，JIN Hanxin，QIANG Lei，GAO Guoju，DU Yang，HUANG He

（School of Computer Science and Technology，Soochow University，Suzhou，Jiangsu 215006，China）

【Abstract】As a result of the popularity of mobile devices，Mobile Crowd Sensing（MCS）has attracted a lot of attention.Task allocation is a significant problem in MCS.Most previous studies mainly focused on stationary spatial tasks while neglecting the changes of tasks and workers.In this paper，the proposed hybrid two-phase task allocation algorithm considers heterogeneous tasks and diverse workers.For heterogeneous tasks，there are different start times and deadlines.In each round，the tasks are divided into urgent and non-urgent tasks.The diverse workers are classified into opportunistic and participatory workers.The former complete tasks on their way，so they only receive a fixed payment as employment compensation，while the latter commute a certain distance that a distance fee is paid to complete the tasks in each round as needed apart from basic employment compensation.The task allocation stage is divided into multiple rounds consisting of the opportunistic worker phase and the participatory worker phase.At the start of each round，the hiring of opportunistic workers is considered because they cost less to complete each task.The Poisson distribution is used to predict the location that the workers are going to visit，and greedily choose the ones with high utility.For participatory workers，the urgent tasks are clustered by employing hierarchical clustering after selecting the tasks from the uncompleted task set.After completing the above steps，the tasks are assigned to participatory workers by extending the Kuhn-Munkres（KM）algorithm.The rest of the uncompleted tasks are non-urgent tasks which are added to the task set for the next round.Experiments are conducted based on a real dataset，Brightkite，and three typical baseline methods are selected for comparison.Experimental results show that the proposed algorithm has better performance in terms of total cost as well as efficiency under the constraint that all tasks are completed.

【Key words】Mobile Crowd Sensing（MCS）；two-phase task allocation；Kuhn-Munkres（KM）algorithm；opportunistic worker；participatory worker

0 Introduction

Nowadays，with the proliferation of smart phones，bracelets，and other mobile devices，Mobile Crowd Sensing（MCS）［1］has been widely used in traffic prediction［2］，air pollution monitoring［3］，and so on.As a critical part of MCS，the allocation of sensing tasks is extensively studied［4］.In the sensing task allocation stage，it needs to choose an appropriate strategy to allocate tasks to workers in order to reduce costs or improve data quality.In this paper，authors classify workers into opportunistic［5］and participatory workers［6］，and assign dynamic tasks（tasks with different start time and deadlines）in each round.To minimize the total cost under the constraint that all tasks are completed on time，three challenges must be addressed.First，how to accurately predict the path of opportunistic workers based on the past trajectory；Second，how to combine the task allocation of opportunistic workers and participatory workers organically；Third，how to find the appropriate workers to reduce the overall system cost.

To address the above challenges，this paper puts forward a reliable hybrid two-phase task allocation algorithm.Specifically，it focuses on the mixed task allocation problem in which all tasks should be completed before the deadline with a minimum cost.The proposed algorithm considers the opportunistic worker phase and the participatory worker phase in each round.In the former phase，this paper first predicts the probability of workers visiting each location，then introduces the greedy algorithm to choose a group of candidate workers based on geographical entropy.In the later phase，it divides uncompleted tasks into urgent tasks and non-urgent tasks.This paper employs hierarchical clustering［7］to cluster the urgent tasks and assigns them to participatory workers by extending the Kuhn-Munkres（KM）algorithm in the current round.Non-urgent tasks are added to the task set of next round，waiting for possible opportunistic workers to complete the tasks to save cost.

1 Related work

There has been large numbers of studies on the task allocation problem in MCS，which mainly focus on maximizing task quality or task coverage under the budget constraints［8］.YANG et al.proposed an efficient budgeted informativeness maximization algorithm to recruit workers and realized informativeness maximization under the budget constraint［9］.GONG et al.focused on the scenarios where users and tasks arrived dynamically and designed four online task assignment algorithms to maximize total task quality［10］.GUO et al.proposed a worker selection framework for time-sensitive tasks that seek to minimize the total distance and for delay-tolerant tasks that seek to minimize the total number of workers［11］.

Some studies also focused on selecting appropriate workers to minimize the overall system cost.XIAO et al.introduced a highly reliable multi-task allocation framework that considered the offline and online stage［12］.They combined the reverse auction mechanism with the network fl ow model to effectively minimize cost and maximize coverage.WANG et al.proposed a sparse mobile crowdsensing paradigm to reduce the number of tasks according to the spatial and temporal correlation among the data，thus lowering total sensing cost［13］.KANG et al.proposed a hitchhiking working style that encourages crowd workers to commit tasks on their way so as to expand space coverage while reducing costs［14］.

Most of the above studies have given incomplete views.Some of them consider spatial tasks but ignore time constraints，while some take deadline into account but neglect the dynamic changes of tasks and workers.In this paper，authors consider the real-time requirements and heterogeneity of sensing tasks with different time of arrival and deadlines.In addition，the active time of opportunistic workers and participatory workers may also be different.

2 Problem model

In this section，the task allocation problem is definesd and formulatesd，in order to minimize the total cost under the constraint of completing all tasks before each task expires.

2.1 Problem definition

In the mobile crowd sensing system，tasks are generally heterogeneous in reality.There may be shorttime tasks such as puddle detection，or non-urgent tasks such as data acquisition of old buildings in the city.Moreover，workers in the MCS are diverse.They can be busy employees，students with a fixed route，or part-time workers recruited by the platform.This paper studies how to assign heterogeneous tasks to mixed workers，in order to minimize the cost while ensuring that all tasks are completed.

In this paper，time is divided into numerous rounds，expressed asR={R1，R2，…，Rt}，in whichRkdenotes thekthround.In each round，there exists a variety of sensing tasksΓ={T1，T2，…，Tw}.Each taskTscan be represented asTs=，in whichlsdenotes the location ofTsconsisting of latitude and longitude，andrepresents the remaining survival time of the task from the start time of this round to deadline［15］.Only tasks that are completed before the deadline are actually effective.Sensing tasks are posted only at the beginning of each round，but their deadline can be any time during the round.

For workers in the MCS model，this paper divides them into two categories based on their historical activity patterns.One is opportunistic workers，denoted asO={o1，o2，…，om}.These workers have a relatively fixed path of movement each day and prefer to complete tasks along the way.For an opportunistic workeroi(1 ≤i≤m)，there may be a record of his/her trajectories in each past round.Besides，there is no upper limit on the number of tasks they can complete.The other category is participatory workers，denoted asP={p1，p2，…，pn}.These workers usually have variable routes，but they are willing to take a detour at any time to complete a certain number of tasks.Here authors assume that each participatory worker completes an upper limit ofηtasks in each round.For a participatory workerpj(1 ≤j≤n)，his/her position at the beginning of the current round，and the speedνjat which he/she can move toward the tasks can be gotten.

With the above definition of opportunistic and participatory workers，the roundRkcan be further divided.Since the active time of opportunistic workers is limited，this paper divides the roundRkinto two parts：the first part of the round is for opportunistic workers，denoted as，and the remaining time is for participatory workers，denoted as.Furthermore，the location ofpjat the beginning ofwill be gotten.The whole process of task allocation is shown in Fig.1.

Fig.1 Hybrid two-phase task allocation framework

The objective of this paper is to minimize the total cost［16］.Therefore，the costs of both types of workers need to be considered.For opportunistic workers，this paper denotes their individual cost asIbasic.For participatory workers，a certain amount for their detour must be paid which isIdisper kilameter.Besides，they also receive a fixed cost for each task completed，which is denoted asItask.On this basis，in roundRk，if a participatory workerpjmoves a total ofkm to completetasks，the cost is：

In summary，the total cost of employing workers in roundRkis：

whereSkrepresents the set of opportunistic workers and |Sk| represents the number of its elements.Pkrepresents the set of participatory workers in roundRk.

At the same time，the constraints need to be met that all tasks are completed by the deadline and participatory workers have an upper limit of tasks：

2.2 Task allocation model for opportunistic workers

For opportunistic workers，this paper predicts the future location ofoiaccording to the historical data，and gets the probability of reaching each location in the next round.

The probability ofoiarriving at locationLsforhtimes follows an inhomogeneous Poisson process［17］.Hence，the probability can be predicted thatoiwill arrive at theLsin the roundRk：

whereλi，sdenotes the average times foroito reachLs.

Based on the geographical entropy［18］，the priority of each task is：

The greater the weight of a location is，the lower the probability that it will be visited by participatory workers，and the higher the expected reward is required by participatory workers to complete tasks nearby.Since the cost of an opportunistic worker is certain，more cost can be saved by allocating the tasks to opportunistic workers.

The utility of each opportunity worker to overall system is defined as：

In the search of opportunistic workers，authors seek to find anoithat maximizes the overall increase in utility for which this paper sets a thresholdξ.A gain-based OW selection greedy algorithm is introduced，which is presented in Algorithm1.This paper greedily selects the opportunistic workers with the greatest utility increment until there is no opportunistic worker who can bring a utility increment greater thanξ.The set of selected opportunistic workers is denoted asSk.

Algorithm1Gain-based OW selection greedy algorithm

2.3 Task allocation model for participatory workers

Based on the set of candidate opportunistic workersSk，the actual set of uncompleted tasksat the end ofcan be gotten.Among them，whether a task is urgent can be calculated according to the following formula：

WhereTs∈，μrefers to the minimal remaining time for participatory workers to complete the task in the next round.A task will be considered urgent in two cases：1）The task will be terminated in the current round；2）The time left for the participatory workers in the next round is less than the minimal remaining timeμ.IfJk(Ts)=1，the task is supposed to be allocated during.Then a candidate task set for participatory workers will be gotten.First，this paper uses the hierarchical clustering algorithm to cluster the tasks so that a participatory worker can complete the tasks with a short distance to save the cost.The cluster task set is denoted as CTk，where |ct|≤η，ct ∈CTk.

Then this paper introduces the improved KM algorithm［19］to achieve task allocation in thephase.In this algorithm，participatory workerpjis selected as the left node of the bipartite graph，and cluster task ctias the right node.Since only the perfect bipartite graph is fit for KM algorithm，this algorithm introduces a set of virtual task nodes VΓk.The structure of the bipartite graph is shown in Fig.2，where the weights are set as follows：

Fig.2 Bipartite graph structure

The initial mark value of the left node is the maximal value of the weights for the edges connected to it.The initial mark value of the right node is 0.

Where match[cti]=-1，it denotes that the task ctidoesn’t have a match node or the value of match[cti]is the index of the matched worker.The function findPath is to find a next match for the worker and update the values of match.

With the KM-based PW-task match algorithm（as shown in Algorithm2），the result of task allocation can be obtained and its cost can be calculated.Finally，this paper estimates the total cost of both opportunistic workers and participatory workers through adding up their costs during each round.

Algorithm2KM-based PW-task match algorithm

3 Performance evaluation

In this section，the experimental results and analyses are presented.All experiments are carried out by Java on Windows platform，realized on an Intel®CoreTMi5-8265U CPU and 8 GB memory laptop computer.

3.1 Baseline methods

In order to verify the performance of the proposed algorithm，this paper selects three typical different baseline methods to carry out comparison experiments.To be specific，the baseline methods this paper selected include the KM algorithm with pure participatory workers（BasicKM），the Greedy-genetic algorithm，and the Hy-greedy algorithm based on the hybrid mode.BasicKM takes the participatory workers and tasks as the node sets，and constructs the inverse function of the distance between them to carry out the maximum weight matching.The Greedy-genetic algorithm［20］initializes the population by taking the greedy solution of task allocation as the initial solution.Hy-greedy algorithm greedily selects the opportunistic workers whose incremental utility is greater than the threshold，and assigns each remaining task to the nearest participatory worker to deal with.

3.2 Dataset processing

In the experiment，authors use the Brightkite dataset［4，21］.The dataset includes a total collection of 4 491 143 check-in information of users.This paper selects an area of 60 km×80 km near 40° N and 104°W，as shown in Fig.3，in which authors publish sensing tasks and recruit workers.Data of May 2009 in this region is for training，while the following ten days for test.This paper sets 8 a.m.to 5 p.m.the active time for opportunistic workers and 5 p.m.to 7 p.m.for participatory workers.Authors assume that for each task，there must exist a participatory worker who can get to the task in 2 h.During each round，this paper randomly generates sensing tasks within the area，and sets the deadline for tasks on the basis of standardized normal distribution.When workers reach a task within 1 km，it considers that the task has been completed.It selects workers with sufficient data in the test set as opportunistic workers.The speed of participatory workers is randomly classified into 7 km/h，20 km/h and 40 km/h.Since the worker data set is relatively sparse，this paper supplements part of the worker information appropriately according to the principle of normal distribution.In addition，it setsItask=10，Ibasic=20 andIdis=8.

Fig.3 Distribution of workers in the dataset

3.3 Experimental results

This paper uses the real dataset to compare the performance of the proposed algorithm with the baseline methods.The experimental results are shown in Fig.4.

Fig.4 Average cost of tasks vs.threshold for the incremental utility ξ

Fig.4（a）shows the change of the optimalξunder differentIbasic.It can be seen that the higher the cost of opportunistic workers is，the stricter the selection of opportunistic workers will be，and the larger the corresponding optimalξwill be.With the increase ofξ，the number of opportunistic workers decreases gradually，and the influence ofIbasicon the cost decreases，leading to gradual convergence of the curve.Fig.4（b）shows the change in cost caused byξunder different taskparticipant proportions.Whenξis small，the cost of selecting too many inappropriate opportunistic workers is high，while asξincreases，the cost drops rapidly.However，whenξis large，its increase will lead to the abandonment of suitable opportunistic workers，and the cost will gradually rise and finally reach the situation of pure participatory workers.

Fig.5 shows the performance of each algorithm under the condition of different numbers of participatory workers and tasks.It is illustrated that the proposed algorithm is much more efficient than Hy-greedy and BasicKM algorithm.And even the greedygenetic algorithm is at a disadvantage compared to the proposed algorithm.

Fig.5 Average cost of tasks vs.the number of participatory workers

In Fig.6，authors pick out the best-performing greedy-genetic algorithm to conduct the further comparison experiments.It can be seen that the proposed algorithm delivers better performance under various conditions than baseline methods.

Fig.6 The proposed algorithm vs.Greedy-genetic algorithm

4 Conclusion

This paper puts forward a reliable hybrid twophase task allocation algorithm.In the opportunistic workers phase，it predicts the probability of workers arriving at each location，and uses the greedy algorithm to calculate the utility to choose a group of candidate workers based on the geographical entropy.In the participatory workers phase，it employs hierarchical clustering to cluster the urgent tasks and assigns them to workers through the improved KM algorithm.Experiments based on the real-world dataset show that the proposed algorithm outperforms Hy-greedy，BasicKM and Greedy-genetic algorithms.

In this paper，the accurate geographic information of participants is assumed to be obtained in advance，which might not be achieved for privacy-sensitive participants［22-23］.In future work，authors will conduct further research on how to realize dynamic task allocation based on user privacy protection.

计算机工程2022年3期

计算机工程的其它文章: 基于袋外预测和扩展空间的随机森林改进算法; 基于主侧链合作的区块链访问控制策略; 基于预训练⁃微调策略的COVID⁃19 预测模型; 基于图卷积神经网络的智能路由算法; 融合实体类别信息的实体关系联合抽取; 基于数据更新间隔的NAND 闪存垃圾回收算法