Accordingly, we call our service composition model the Multi-Objective Markov Decision Process based Web Service Composition (MOMDP-WSC), which simply replaces the actions in an MOMDP with Web services.

Definition 4 (MOMDP-Based Web Service Composition (MOMDP-WSC)). An MOMDP-WSC is defined as a 6-tuple MOMDP-WSC = ⟨S_i, s_i^0, S_i^r, A_i(·), P_i, R_i⟩, where:

- S_i is a finite set of world states observed by agent i;
- s_i^0 ∈ S_i is the initial state; any execution of the service composition usually starts from this state;
- S_i^r ⊆ S_i is the set of terminal states; upon arriving at one of these states, an execution of the service composition terminates;
- A_i(s) is the set of Web services that can be executed in state s ∈ S_i; a Web service ws belongs to A_i(s) only if its precondition ws^P is satisfied by s;
- P_i is the transition probability function: when a Web service ws ∈ A_i(s) is invoked, agent i makes a transition from its current state s to a resulting state s' in which the effect of ws is satisfied; for each s', the transition occurs with probability P_i(s' | s, ws);
- R_i is the reward function: when a Web service ws ∈ A_i(s) is invoked and agent i makes a transition from s to s', the service consumer receives an immediate reward r_i whose expected value is R_i(s' | s, ws).

When selecting a Web service ws under multiple QoS criteria, agent i receives the following reward vector:

Q(s, ws, s') = [Q_1(s, ws, s'), Q_2(s, ws, s'), ..., Q_M(s, ws, s')]^T,   (3)

where each Q_j denotes a QoS attribute of ws.

The solution to an MOMDP-WSC is a decision policy, defined as a procedure for selecting a Web service ws ∈ A_i(s) by agent i in each state s. These policies, denoted by π, are mappings from states to actions, defined as:
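The 6-tuple in Definition 4 can be made concrete as a small data structure. The following is a minimal sketch, not code from the paper; the class name, the toy states "s0"/"s1", the service "ws1", and the two QoS attributes are illustrative assumptions only:

```python
from dataclasses import dataclass
from typing import Callable, List, Set

State = str
Service = str  # in an MOMDP-WSC, a Web service plays the role of an action


@dataclass
class MOMDPWSC:
    """Sketch of the 6-tuple <S_i, s_i^0, S_i^r, A_i(.), P_i, R_i>."""
    states: Set[State]                        # S_i: world states observed by agent i
    initial_state: State                      # s_i^0: where executions start
    terminal_states: Set[State]               # S_i^r: executions stop on arrival here
    actions: Callable[[State], Set[Service]]  # A_i(s): services whose precondition holds in s
    transition: Callable[[State, Service, State], float]  # P_i(s' | s, ws)
    reward: Callable[[State, Service, State], List[float]]  # QoS vector [Q_1, ..., Q_M]


# Toy instance: one service leading from the initial to the terminal state,
# rewarded with a 2-dimensional QoS vector (e.g. [reliability, response time]).
wsc = MOMDPWSC(
    states={"s0", "s1"},
    initial_state="s0",
    terminal_states={"s1"},
    actions=lambda s: {"ws1"} if s == "s0" else set(),
    transition=lambda s, ws, s2: 1.0 if (s, ws, s2) == ("s0", "ws1", "s1") else 0.0,
    reward=lambda s, ws, s2: [0.9, 120.0],
)
```

Representing A_i, P_i, and R_i as callables keeps the sketch close to the definition: precondition checking lives inside `actions`, and the reward is a vector rather than a scalar, which is what distinguishes the MOMDP from an ordinary MDP.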
302 A. Moustafa and M. Zhang

π : S → A.   (4)

Each policy of an MOMDP-WSC defines a single workflow; the task of our service composition model is therefore to identify the set of Pareto optimal policies that gives the best trade-offs among the multiple QoS criteria.

3 Multi-Objective Reinforcement Learning for Service Composition

In order to solve the above-mentioned MOMDP, we propose an approach based on Multi-Objective Reinforcement Learning (MORL). The goal of MORL is to acquire the set of Pareto optimal policies of the MOMDP model. The set Π_p of Pareto optimal policies is defined by:

Π_p = { π_p ∈ Π | ∄ π ∈ Π s.t. π(s) >_p π_p(s), ∀ s ∈ S },   (5)

where Π is the set of all policies and >_p is the dominance relation. For two vectors a = (a_1, a_2, ..., a_n) and b = (b_1, b_2, ..., b_n), a >_p b means that a_i ≥ b_i is satisfied for all i and a_i > b_i is satisfied for at least one i. Moreover, V^π(s) = (V_1^π(s), V_2^π(s), ..., V_M^π(s)) is the value vector of state s under policy π, and it is defined by:

V^π(s) = E_π [ Σ_{k=0}^{∞} γ^k r_{t+k+1} | s_t = s ],   (6)

where E_π is the expected value provided that the agent follows policy π, s_t is the state at time t, r_t is the reward vector at time t, and γ is the discount rate parameter. We also define the Q-learning [20] vector by:

Q^π(s, a) = E_π [ Σ_{k=0}^{∞} γ^k r_{t+k+1} | s_t = s, a_t = a ],   (7)

where
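The dominance relation >_p and the Pareto optimal set of Eq. (5) can be sketched directly for finite sets of value vectors. This is an illustrative sketch only (the function names are ours, not the paper's), applying the definition to concrete vectors rather than to full policies:

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a >_p b: a_i >= b_i for all i, and a_i > b_i for at least one i."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(vectors: List[List[float]]) -> List[List[float]]:
    """Keep the vectors not dominated by any other vector (Eq. (5) over a finite set)."""
    return [v for v in vectors
            if not any(dominates(u, v) for u in vectors if u is not v)]
```

For example, among the QoS value vectors [1, 2], [2, 1], [2, 2], and [0, 0], only [2, 2] is Pareto optimal: it dominates every other vector, while [1, 2] and [2, 1] each trade one objective against the other but are both dominated by [2, 2].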