jurafsky&martin_3rdEd_17 (1).pdf

(28.8)
Wantsentence (i want to [fly go])
Airports [(san francisco) denver]

VoiceXML grammars allow semantic attachments, such as the text string ("denver, colorado") returned for the City rule, or a slot/filler, like the attachments for the
Flight rule, which fills the slot (<origin> or <destination> or both) with the value passed up in the variable x from the City rule. Because Fig. 28.13 is a mixed-initiative grammar, the grammar has to be applicable to any of the fields. This is done by making the expansion for Flight a disjunction; note that it allows the user to specify only the origin city, the destination city, or both.

28.4 Evaluating Dialogue Systems

Evaluation is crucial in dialog system design. If the task is unambiguous, we can simply measure absolute task success (did the system book the right plane flight, or put the right event on the calendar).

To get a more fine-grained idea of user happiness, we can compute a user satisfaction rating, having users interact with a dialog system to perform a task and then having them complete a questionnaire. For example, Fig. 28.14 shows multiple-choice questions of the sort used by Walker et al. (2001); responses are mapped into the range of 1 to 5, and then averaged over all questions to get a total user satisfaction rating.

TTS Performance     Was the system easy to understand?
ASR Performance     Did the system understand what you said?
Task Ease           Was it easy to find the message/flight/train you wanted?
Interaction Pace    Was the pace of interaction with the system appropriate?
User Expertise      Did you know what you could say at each point?
System Response     How often was the system sluggish and slow to reply to you?
Expected Behavior   Did the system work the way you expected it to?
Future Use          Do you think you'd use the system in the future?

Figure 28.14 User satisfaction survey, adapted from Walker et al. (2001).

It is often economically infeasible to run complete user satisfaction studies after every change in a system. For this reason, it is often useful to have performance evaluation heuristics that correlate well with human satisfaction. A number of such factors and heuristics have been studied.
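The user satisfaction rating described above (responses mapped to the range 1 to 5, then averaged over all questions) can be sketched in a few lines of Python. This is only an illustration: the function name `user_satisfaction` and the sample scores are hypothetical, and the sketch assumes each response has already been mapped onto the 1 to 5 scale so that higher means more satisfied.

```python
from typing import Dict


def user_satisfaction(responses: Dict[str, int]) -> float:
    """Average the per-question scores (each assumed to be in 1..5)."""
    if not responses:
        raise ValueError("at least one response is required")
    for question, score in responses.items():
        if not 1 <= score <= 5:
            raise ValueError(f"score for {question!r} must be in 1..5, got {score}")
    return sum(responses.values()) / len(responses)


# Hypothetical answers from one user, keyed by the survey dimensions of Fig. 28.14.
example = {
    "TTS Performance": 4,
    "ASR Performance": 5,
    "Task Ease": 3,
    "Interaction Pace": 4,
    "User Expertise": 2,
    "System Response": 4,
    "Expected Behavior": 5,
    "Future Use": 5,
}
rating = user_satisfaction(example)
```

Averaging such ratings over many users would give a system-level score, which is the kind of quantity the cheaper heuristics are meant to approximate.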
One method that has been used to classify these factors is based on the idea that an optimal dialog system is one that allows users to accomplish their goals (maximizing task success) with the fewest problems (minimizing costs). We can then study metrics that correlate with these two criteria.

Task completion success: Task success can be measured by evaluating the correctness of the total solution. For a frame-based architecture, this might be the percentage of slots that were filled with the correct values or the percentage of subtasks that were completed. Interestingly, sometimes the user's perception of whether they completed the task is a better predictor of user satisfaction than the actual task completion success (Walker et al., 2001).
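For a frame-based system, the slot-based measure of task completion success mentioned above might be computed as follows. This is a minimal sketch, not a standard API: the function name `slot_accuracy`, the `date` slot, and the example values are assumptions made for illustration, and a slot counts as correct only if its filled value exactly matches the reference.

```python
from typing import Dict, Optional


def slot_accuracy(reference: Dict[str, str],
                  filled: Dict[str, Optional[str]]) -> float:
    """Fraction of reference slots whose filled value matches exactly."""
    if not reference:
        raise ValueError("reference frame must contain at least one slot")
    correct = sum(
        1 for slot, value in reference.items() if filled.get(slot) == value
    )
    return correct / len(reference)


# Hypothetical frames: what the user actually wanted vs. what the system filled in.
reference = {"origin": "san francisco", "destination": "denver", "date": "tuesday"}
filled = {"origin": "san francisco", "destination": "boston", "date": "tuesday"}
accuracy = slot_accuracy(reference, filled)  # two of the three slots match
```

Exact string matching is the simplest choice here; a real evaluation would likely normalize values (casing, synonyms such as city codes) before comparing.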