
Question

Suppose you have joined a search engine development team to design a search algorithm based on both the Vector model and the Boolean model.

You have collected the following three unstructured documents and plan to apply an indexing technique to convert them into an inverted index.

Doc 1: data science is a field to use scientific method, process, algorithm, system to extract knowledge.

Doc 2: data mining is the process to discover pattern in large data to involve method at the database system.

Doc 3: information system is the study of network of hardware and software that people use to process data.

To answer the questions below, provide detailed step-by-step procedures.

Question 1.1: In the process of creating the inverted index, please complete the following steps:

Remove all stop words and punctuation.

The list of stop words for this task is provided as follows:

Is, An, That, Use, And, To, From, In, Both, Of, At, The

Question 1.2: Create a merged inverted list including the within-document frequencies for each term.
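
One possible way to carry out the stop-word removal and build the merged inverted list is sketched below. It is a minimal illustration, not a reference solution. It assumes that "the" is the final entry of the stop-word list, that matching is case-insensitive, and that punctuation is simply deleted. The names `DOCS`, `STOP_WORDS`, `preprocess`, and `build_inverted_list` are hypothetical.

```python
import string
from collections import Counter

# Documents copied from the question text.
DOCS = {
    1: "data science is a field to use scientific method, process, "
       "algorithm, system to extract knowledge.",
    2: "data mining is the process to discover pattern in large data "
       "to involve method at the database system.",
    3: "information system is the study of network of hardware and "
       "software that people use to process data.",
}

# Stop words from the question, lowercased. Treating "the" as part of
# the list is an assumption.
STOP_WORDS = {"is", "an", "that", "use", "and", "to", "from",
              "in", "both", "of", "at", "the"}


def preprocess(text):
    """Lowercase, delete punctuation, split on whitespace, drop stop words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def build_inverted_list(docs):
    """Map each term to a list of (doc_id, within-document frequency)."""
    index = {}
    for doc_id in sorted(docs):
        for term, tf in Counter(preprocess(docs[doc_id])).items():
            index.setdefault(term, []).append((doc_id, tf))
    return dict(sorted(index.items()))  # terms in alphabetical order


inverted = build_inverted_list(DOCS)
for term, postings in inverted.items():
    print(term, postings)
```

Note that "a" is not in the given stop-word list, so under these assumptions it remains an index term.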

Question 1.3: Use the index created above to create a dictionary and the related posting file.
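
A dictionary and posting file could be laid out as in the following sketch. It assumes one common textbook layout: the dictionary stores each term with its document frequency and an offset into a flat posting list of (document, frequency) records. The layout, the stop-word set (including "the"), and all names here are assumptions made for illustration.

```python
import string
from collections import Counter

DOCS = {
    1: "data science is a field to use scientific method, process, "
       "algorithm, system to extract knowledge.",
    2: "data mining is the process to discover pattern in large data "
       "to involve method at the database system.",
    3: "information system is the study of network of hardware and "
       "software that people use to process data.",
}
# Assumed stop-word set (lowercased, with "the" included).
STOP_WORDS = {"is", "an", "that", "use", "and", "to", "from",
              "in", "both", "of", "at", "the"}


def tokens(text):
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


# Merged inverted list: term -> [(doc_id, tf), ...]
merged = {}
for doc_id in sorted(DOCS):
    for term, tf in Counter(tokens(DOCS[doc_id])).items():
        merged.setdefault(term, []).append((doc_id, tf))

postings = []    # flat "posting file" of (doc_id, tf) records
dictionary = {}  # term -> (document frequency, offset into postings)
for term in sorted(merged):
    dictionary[term] = (len(merged[term]), len(postings))
    postings.extend(merged[term])


def lookup(term):
    """Return the posting records for a term, or [] if it is not indexed."""
    df, offset = dictionary.get(term, (0, 0))
    return postings[offset:offset + df]


print(dictionary["data"], lookup("data"))
```

Keeping only a count and an offset in the dictionary lets the posting file live in a separate contiguous structure, which is the usual motivation for splitting the two.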

Question 1.4: Please design three Boolean queries (e.g., web AND search) and list the relevant documents for each query. Each query must contain at least two keywords, and no keyword may appear in only one document.

Question 1.5: Please use the Vector model to query the inverted index, and compare the result with the Boolean model.

(Hint: you can use cosine similarity and set a similarity threshold).
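
The two retrieval models can be compared with a short sketch like the one below. It makes several assumptions: the three Boolean queries are examples chosen so that every keyword occurs in more than one document, the vector weights are raw term frequencies (with these documents an IDF factor would be zero for terms found in all three), and the threshold of 0.3 is an arbitrary illustrative value. For brevity it works from per-document term-frequency maps, which hold the same information as the inverted list. All names are hypothetical.

```python
import math
import string
from collections import Counter

DOCS = {
    1: "data science is a field to use scientific method, process, "
       "algorithm, system to extract knowledge.",
    2: "data mining is the process to discover pattern in large data "
       "to involve method at the database system.",
    3: "information system is the study of network of hardware and "
       "software that people use to process data.",
}
# Assumed stop-word set (lowercased, with "the" included).
STOP_WORDS = {"is", "an", "that", "use", "and", "to", "from",
              "in", "both", "of", "at", "the"}


def tokens(text):
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


# Per-document term frequencies.
TF = {doc_id: Counter(tokens(text)) for doc_id, text in DOCS.items()}


def docs_with(term):
    """Set of document ids containing the term."""
    return {doc_id for doc_id, tf in TF.items() if term in tf}


# Boolean model: AND / OR / NOT map to set intersection, union, difference.
boolean_results = {
    "data AND method": docs_with("data") & docs_with("method"),
    "method OR process": docs_with("method") | docs_with("process"),
    "system AND NOT method": docs_with("system") - docs_with("method"),
}


def cosine(query_tf, doc_tf):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * doc_tf.get(term, 0) for term, w in query_tf.items())
    norm = (math.sqrt(sum(w * w for w in query_tf.values()))
            * math.sqrt(sum(w * w for w in doc_tf.values())))
    return dot / norm if norm else 0.0


def vector_search(query, threshold=0.3):
    """Rank documents by cosine similarity and keep those at or above threshold."""
    query_tf = Counter(tokens(query))
    scores = {doc_id: cosine(query_tf, tf) for doc_id, tf in TF.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(doc_id, round(score, 4)) for doc_id, score in ranked
            if score >= threshold]


for query, result in boolean_results.items():
    print(query, "->", sorted(result))
print("vector 'data method' ->", vector_search("data method"))
```

Under these assumptions, the Boolean query `data AND method` returns an unordered set of documents, while the vector query `data method` returns a ranked list; the threshold decides how many partially matching documents are retained.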
